35.3.1.2 Character Classes

Below is a table of the classes you can use in a bracket expression (see bracket expression), and what they mean. Note that the '[' and ']' characters that enclose the class name are part of the name, so a regular expression using these classes needs one more pair of brackets. For example, a regular expression matching a sequence of one or more letters and digits would be '[[:alnum:]]+', not '[:alnum:]+'.
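The difference between the two forms can be checked directly. The following is a minimal sketch; the helper name `my-first-alnum-run` is hypothetical and exists only for illustration.

```elisp
;; Hypothetical helper: return the first run of letters and digits.
(defun my-first-alnum-run (string)
  "Return the first run of letters and digits in STRING, or nil."
  (when (string-match "[[:alnum:]]+" string)
    (match-string 0 string)))

(my-first-alnum-run "--abc123--")          ; => "abc123"

;; Without the outer brackets, the pattern is an ordinary bracket
;; expression containing the characters `:', `a', `l', `n', `u', `m'.
(string-match-p "[:alnum:]+" "xyz789")     ; => nil
(string-match-p "[[:alnum:]]+" "xyz789")   ; => 0
```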

'`[:ascii:]`'

This matches any ASCII character (codes 0 through 127).

'`[:alnum:]`'

This matches any letter or digit. For multibyte characters, it matches characters whose Unicode 'general-category' property (see Character Properties) indicates they are alphabetic or decimal number characters.

'`[:alpha:]`'

This matches any letter. For multibyte characters, it matches characters whose Unicode 'general-category' property (see Character Properties) indicates they are alphabetic characters.

'`[:blank:]`'

This matches horizontal whitespace, as defined by Annex C of the Unicode Technical Standard #18. In particular, it matches spaces, tabs, and other characters whose Unicode 'general-category' property (see Character Properties) indicates they are spacing separators. (If you only need to look for ASCII whitespace characters, we suggest using an explicit set of character alternatives, such as '[ \t]', instead, as it will be faster than '[[:blank:]]'.)
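As a rough illustration of how the class and the explicit ASCII set differ, the sketch below compares them on a tab and on U+00A0 NO-BREAK SPACE, which is a spacing separator but not an ASCII character. The sample strings are made up for the example.

```elisp
;; Both forms find an ASCII tab.
(string-match-p "[[:blank:]]" "a\tb")       ; => 1
(string-match-p "[ \t]" "a\tb")             ; => 1

;; Only the class finds U+00A0 NO-BREAK SPACE.
(string-match-p "[[:blank:]]" "a\u00a0b")   ; => 1
(string-match-p "[ \t]" "a\u00a0b")         ; => nil
```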

'`[:cntrl:]`'

This matches any character whose code is in the range 0 through 31.

'`[:digit:]`'

This matches '0' through '9'. Thus, '[-+[:digit:]]' matches any digit, as well as '+' and '-'.
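A short sketch of that bracket expression in use follows; the predicate name `my-sign-or-digits-p` is hypothetical. The anchors '\`' and '\'' restrict the match to the whole string.

```elisp
;; Hypothetical predicate built on the bracket expression above.
(defun my-sign-or-digits-p (string)
  "Return non-nil if STRING is non-empty and holds only digits, `+' and `-'."
  (string-match-p "\\`[-+[:digit:]]+\\'" string))

(my-sign-or-digits-p "+42")   ; => 0
(my-sign-or-digits-p "4.2")   ; => nil, because `.' is not in the set
```

Note that this accepts any mixture of signs and digits, such as "4-2+"; it checks set membership only, not number syntax.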

'`[:graph:]`'

This matches graphic characters: everything except spaces, ASCII and non-ASCII control characters, surrogates, and codepoints unassigned by Unicode, as indicated by the Unicode 'general-category' property (see Character Properties).

'`[:lower:]`'

This matches any lower-case letter, as determined by the current case table (see The Case Table). If case-fold-search is non-nil, this also matches any upper-case letter. Note that a buffer can have its own local case table different from the default one.

'`[:multibyte:]`'

This matches any multibyte character (see Text Representations).

'`[:nonascii:]`'

This matches any non-ASCII character.
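One common use of this class is testing whether a string is pure ASCII. The sketch below assumes a hypothetical helper named `my-ascii-only-p`.

```elisp
;; Hypothetical helper: non-nil if STRING has no non-ASCII characters.
(defun my-ascii-only-p (string)
  "Return t if every character of STRING is ASCII."
  (not (string-match-p "[[:nonascii:]]" string)))

(my-ascii-only-p "cafe")                        ; => t
(my-ascii-only-p "caf\u00e9")                   ; => nil
;; The match position is the index of the first non-ASCII character.
(string-match-p "[[:nonascii:]]" "caf\u00e9")   ; => 3
```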

'`[:print:]`'

This matches any printing character: either spaces or graphic characters matched by '[:graph:]'.

'`[:punct:]`'

This matches any punctuation character. (At present, for multibyte characters, it matches anything that has non-word syntax, and thus its exact definition can vary from one major mode to another, since the syntax of a character depends on the major mode.)

'`[:space:]`'

This matches any character that has whitespace syntax (see Table of Syntax Classes). Note that the syntax of a character, and thus which characters are considered "whitespace", depends on the major mode.

'`[:unibyte:]`'

This matches any unibyte character (see Text Representations).

'`[:upper:]`'

This matches any upper-case letter, as determined by the current case table (see The Case Table). If case-fold-search is non-nil, this also matches any lower-case letter. Note that a buffer can have its own local case table different from the default one.
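The interaction with case-fold-search is easy to overlook. The following sketch, which assumes the default case table, binds the variable explicitly so the case classes behave the same way regardless of the caller's setting.

```elisp
;; With case folding off, the classes are strict.
(let ((case-fold-search nil))
  (list (string-match-p "[[:upper:]]" "abc")
        (string-match-p "[[:lower:]]" "ABC")))
;; => (nil nil)

;; With case folding on, each class also matches the other case.
(let ((case-fold-search t))
  (list (string-match-p "[[:upper:]]" "abc")
        (string-match-p "[[:lower:]]" "ABC")))
;; => (0 0)
```

Code that relies on these classes to distinguish case would therefore typically bind case-fold-search to nil around the match.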

'`[:word:]`'

This matches any character that has word syntax (see Table of Syntax Classes). Note that the syntax of a character, and thus which characters are considered "word-constituent", depends on the major mode.

'`[:xdigit:]`'

This matches the hexadecimal digits: '0' through '9', 'a' through 'f' and 'A' through 'F'.
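For instance, a string can be checked for consisting solely of hexadecimal digits. This is a minimal sketch; `my-hex-string-p` is a hypothetical name.

```elisp
;; Hypothetical predicate: STRING is one or more hexadecimal digits.
(defun my-hex-string-p (string)
  "Return non-nil if STRING is non-empty and holds only hex digits."
  (string-match-p "\\`[[:xdigit:]]+\\'" string))

(my-hex-string-p "DEADbeef")   ; => 0
(my-hex-string-p "0x1F")       ; => nil, because `x' is not a hex digit
```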

The classes '[:space:]', '[:word:]' and '[:punct:]' use the syntax-table of the current buffer but not any overriding syntax text properties (see Syntax Properties).
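The dependence on the syntax table can be sketched by matching the same pattern under two tables. The helper `my-word-char-p` and the variable names are hypothetical; the example assumes that '_' has symbol syntax in the standard syntax table.

```elisp
;; Hypothetical helper: match a single word-constituent under TABLE.
(defun my-word-char-p (string table)
  "Return non-nil if STRING is one word-constituent character under TABLE."
  (with-syntax-table table
    (string-match-p "\\`[[:word:]]\\'" string)))

(let ((plain (copy-syntax-table (standard-syntax-table)))
      (custom (copy-syntax-table (standard-syntax-table))))
  ;; Give `_' word syntax in the custom table only.
  (modify-syntax-entry ?_ "w" custom)
  (list (my-word-char-p "_" plain)
        (my-word-char-p "_" custom)))
;; => (nil 0)
```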
