Regex Character Classes
In regular expressions, a character class is one or more characters enclosed in square brackets []. Character classes match any one character from the set of characters specified within the brackets.
Character classes make it easy to match a specific set of characters or to exclude certain characters from a match. For example, the regular expression [aeiou] matches any lowercase vowel, while [^aeiou] matches any character that is not a lowercase vowel (including consonants, digits, spaces, and uppercase letters).
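As a minimal sketch of the two patterns above, assuming Python's built-in re module (the text does not name a regex engine) and a made-up sample string:

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "Regex Rules"

# [aeiou] matches one lowercase vowel at a time.
vowels = re.findall(r"[aeiou]", text)

# [^aeiou] matches any single character that is NOT a lowercase vowel,
# so consonants, uppercase letters, and the space all qualify.
non_vowels = re.findall(r"[^aeiou]", text)

print(vowels)      # ['e', 'e', 'u', 'e']
print(non_vowels)  # ['R', 'g', 'x', ' ', 'R', 'l', 's']
```

Note that the negated class also picks up the space, which is easy to overlook when reading [^aeiou] as "consonants".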
You can also use a hyphen inside a character class to specify a range of characters, such as [a-z] to match any lowercase letter or [0-9] to match any digit.
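The range syntax can be tried out with a short sketch like the following, again assuming Python's re module; the sample strings are invented for the example:

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "Order A7 shipped on day 42"

# [a-z] matches one lowercase ASCII letter per match.
lowercase = re.findall(r"[a-z]", text)

# [0-9] matches one ASCII digit per match.
digits = re.findall(r"[0-9]", text)

# Several ranges can share one class: here, lowercase hexadecimal digits.
hex_digits = re.findall(r"[0-9a-f]", "c0ffee!")

print("".join(lowercase))  # rdershippedonday
print(digits)              # ['7', '4', '2']
print(hex_digits)          # ['c', '0', 'f', 'f', 'e', 'e']
```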
Character classes can be combined with other regular expression metacharacters, such as quantifiers and anchors, to create more complex patterns.
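One way to picture this combination is a pattern that anchors two character classes to the start and end of the input. This is only a sketch, assuming Python's re module; the helper name is_capitalized is hypothetical:

```python
import re

# ^        start of the string
# [A-Z]    exactly one uppercase letter
# [a-z]+   one or more lowercase letters
# $        end of the string
CAPITALIZED = re.compile(r"^[A-Z][a-z]+$")

def is_capitalized(word: str) -> bool:
    """Return True if word is one uppercase letter followed by lowercase letters."""
    return CAPITALIZED.search(word) is not None

print(is_capitalized("Regex"))   # True
print(is_capitalized("regex"))   # False
print(is_capitalized("Regex1"))  # False
```

Without the anchors, the same classes would also match inside longer strings such as "myRegex1".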
Note: Character classes only match a single character at a time, so if you want to match multiple characters in a row, you'll need to use a quantifier like + or *.
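The difference a quantifier makes can be seen in a small sketch, assuming Python's re module and an invented sample string:

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "abc 12345 de"

# Without a quantifier, each match is a single digit.
single = re.findall(r"[0-9]", text)

# With +, each match is a run of one or more consecutive digits.
runs = re.findall(r"[0-9]+", text)

# * also allows zero characters, so it accepts the empty string; + does not.
star_accepts_empty = re.fullmatch(r"[0-9]*", "") is not None
plus_accepts_empty = re.fullmatch(r"[0-9]+", "") is not None

print(single)              # ['1', '2', '3', '4', '5']
print(runs)                # ['12345']
print(star_accepts_empty)  # True
print(plus_accepts_empty)  # False
```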
The following are common examples of character classes:
- [abc] - matches any one character that is either 'a', 'b', or 'c'.
- [a-z] - matches any one lowercase letter from 'a' to 'z'.
- [A-Z] - matches any one uppercase letter from 'A' to 'Z'.
- [0-9] - matches any one digit from '0' to '9'. Alternatively, use the \d shorthand (in some regex engines, \d also matches non-ASCII Unicode digits).
- [^abc] - matches any one character that is not 'a', 'b', or 'c'.
- [\w] - matches any one word character, that is, a letter, a digit, or an underscore.
- [\s] - matches any one whitespace character, such as a space, tab, or newline.
- [^a-z] - matches any one character that is not a lowercase letter from 'a' to 'z'.
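The shorthand entries in the list can be checked against their spelled-out equivalents. The sketch below assumes Python's re module, where the re.ASCII flag restricts \d, \w, and \s to ASCII characters; other engines may define these shorthands slightly differently, and the sample string is made up:

```python
import re

# Hypothetical sample text, chosen only for illustration.
sample = "user_42 logged in\tat 09:15\n"

# With re.ASCII, each shorthand matches the same characters as the explicit class.
digits_short = re.findall(r"\d", sample, flags=re.ASCII)
digits_class = re.findall(r"[0-9]", sample)

word_short = re.findall(r"\w", sample, flags=re.ASCII)
word_class = re.findall(r"[A-Za-z0-9_]", sample)

space_short = re.findall(r"\s", sample, flags=re.ASCII)
space_class = re.findall(r"[ \t\n\r\f\v]", sample)

print(digits_short == digits_class)  # True
print(word_short == word_class)      # True
print(space_short == space_class)    # True
print(digits_short)                  # ['4', '2', '0', '9', '1', '5']
print(space_short)                   # [' ', ' ', '\t', ' ', '\n']
```

Wrapping a shorthand in brackets, as in [\w], behaves the same as \w alone; the brackets become useful when the shorthand is mixed with other characters, for example [\w-].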
The following demonstrates character classes:
The regex pattern [eor] finds all the occurrences of ‘e’, ‘o’, and ‘r’ in the text.
The regex pattern [^eor] finds all the occurrences of characters except ‘e’, ‘o’, and ‘r’ in the text.
The regex pattern [m-r] finds all the occurrences of lowercase letters from m to r, that is, m, n, o, p, q, and r, in the text.
The regex pattern [\w] finds all the occurrences of letters, digits, and underscore.
The regex pattern [\s] finds all the occurrences of whitespace characters, such as spaces, tabs, and newlines.
The regex pattern [^a-z] finds all the occurrences of characters that are not lowercase letters.
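The six patterns described above can be reproduced with a short script. This is a sketch assuming Python's re module; the sample text is hypothetical and is not the text used in the original demonstration:

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "Zero or more_1"

patterns = [r"[eor]", r"[^eor]", r"[m-r]", r"[\w]", r"[\s]", r"[^a-z]"]

# Map each pattern to the list of single-character matches it finds.
results = {p: re.findall(p, text) for p in patterns}

for pattern, matches in results.items():
    print(pattern, matches)

# [eor]   ['e', 'r', 'o', 'o', 'r', 'o', 'r', 'e']
# [^eor]  ['Z', ' ', ' ', 'm', '_', '1']
# [m-r]   ['r', 'o', 'o', 'r', 'm', 'o', 'r']
# [\w]    ['Z', 'e', 'r', 'o', 'o', 'r', 'm', 'o', 'r', 'e', '_', '1']
# [\s]    [' ', ' ']
# [^a-z]  ['Z', ' ', ' ', '_', '1']
```

Every match is exactly one character long, since none of the patterns uses a quantifier.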
Note that any of these character classes can be followed by a quantifier and combined with other elements in a regex pattern.
Overall, character classes are a powerful tool in regular expressions that allow for precise and flexible pattern matching.