diff --git a/reference/pcre/pattern.syntax.xml b/reference/pcre/pattern.syntax.xml
index fd6fc427fc..d4c4e66755 100644
--- a/reference/pcre/pattern.syntax.xml
+++ b/reference/pcre/pattern.syntax.xml
@@ -1040,6 +1040,48 @@
 terminator is always special and must be escaped when used within an expression.
+
+ Perl supports the POSIX notation for character classes. This uses names
+ enclosed by [: and :] within the enclosing square brackets. PCRE also
+ supports this notation. For example, [01[:alpha:]%]
+ matches "0", "1", any alphabetic character, or "%". The supported class
+ names are:
+
+ Character classes
+
+
+ alnum    letters and digits
+ alpha    letters
+ ascii    character codes 0 - 127
+ blank    space or tab only
+ cntrl    control characters
+ digit    decimal digits (same as \d)
+ graph    printing characters, excluding space
+ lower    lower case letters
+ print    printing characters, including space
+ punct    printing characters, excluding letters and digits
+ space    white space (not quite the same as \s)
+ upper    upper case letters
+ word     "word" characters (same as \w)
+ xdigit   hexadecimal digits
+
+
+ The space characters are HT (9), LF (10), VT (11), FF (12), CR (13),
+ and space (32). Notice that this list includes the VT character (code
+ 11). This makes "space" different from \s, which does not include VT (for
+ Perl compatibility).
+
+
+ The name "word" is a Perl extension, and "blank" is a GNU extension
+ available from Perl 5.8. Another Perl extension is negation, which is
+ indicated by a ^ character after the colon. For example,
+ [12[:^digit:]] matches "1", "2", or any non-digit.
+
+
+ In UTF-8 mode, characters with values greater than 127 do not match any
+ of the POSIX character classes.
+
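The two patterns quoted in the added text can be tried with a minimal sketch like the one below, which uses PHP's preg_match(). The subject strings and the surrounding `^...+$` anchoring are assumptions added for illustration and are not part of the patch.

```php
<?php
// Illustrative sketch only: the subject strings are made up for this example.

// [01[:alpha:]%] matches "0", "1", any alphabetic character, or "%".
$pattern = '/^[01[:alpha:]%]+$/';
$allInClass  = preg_match($pattern, '10a%Z'); // every character is in the class
$hasOutsider = preg_match($pattern, '102');   // "2" is not in the class

// [12[:^digit:]] matches "1", "2", or any non-digit.
$negated = '/^[12[:^digit:]]+$/';
$onlyAllowed = preg_match($negated, '12ab');  // "1", "2", then non-digits
$otherDigit  = preg_match($negated, '123');   // "3" is a digit other than 1 or 2

var_dump($allInClass);   // int(1)
var_dump($hasOutsider);  // int(0)
var_dump($onlyAllowed);  // int(1)
var_dump($otherDigit);   // int(0)
```

The POSIX names are only recognised inside an enclosing character class, which is why each pattern keeps the outer square brackets.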