From 3ba4f8042613365acc6d802f021e3fa425fa98f2 Mon Sep 17 00:00:00 2001
From: Anatol Belski
Date: Tue, 12 Sep 2017 19:36:13 +0000
Subject: [PATCH] Add encoding note

git-svn-id: https://svn.php.net/repository/phpdoc/en/trunk@343080 c90b9560-bf6c-de11-be94-00142212c4b1
---
 reference/parle/pattern.matching.xml | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/reference/parle/pattern.matching.xml b/reference/parle/pattern.matching.xml
index a129526edc..c7bfbb62ed 100644
--- a/reference/parle/pattern.matching.xml
+++ b/reference/parle/pattern.matching.xml
@@ -4,7 +4,12 @@
 Parle pattern matching
 Pattern matching
- Parle supports regex matching similar to flex. Also supported are the following POSIX character sets: [:alnum:], [:alpha:], [:blank:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:] and [:xdigit:].
+
+ Parle supports regex matching similar to flex. Also supported are the following POSIX character sets: [:alnum:], [:alpha:], [:blank:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:] and [:xdigit:].
+
+
+ Unicode character classes are currently not supported. The underlying library supports them through UTF-32, but this is not implemented at the moment. A particular encoding can nevertheless be handled with a correctly constructed regex. For example, to match the EURO symbol encoded in UTF-8, the regular expression [\xE2][\x82][\xAC] can be used. Or, more generally, a regex matching a UTF-8 encoded string could be [ -\x7f]{+}[\x80-\xbf]{+}[\xc2-\xdf]{+}[\xe0-\xef]{+}[\xf0-\xff]{-}[\"\\\]|\\\([\"\\\/bfnrt]|u[0-9a-fA-F]{4}).
+
Character representations
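
The byte sequence in the EURO example from the added note can be sanity-checked outside Parle. The following is only a sketch in Python, using the standard `re` module on byte strings rather than Parle's own API; the names `EURO_UTF8` and `find_euro_offsets` are made up for illustration and do not come from the patch.

```python
import re

# The euro sign is U+20AC; its UTF-8 encoding is the three bytes E2 82 AC.
# The pattern mirrors the byte-wise regex from the note, applied to bytes.
EURO_UTF8 = re.compile(rb"[\xE2][\x82][\xAC]")


def find_euro_offsets(data):
    """Return the byte offsets at which a UTF-8 encoded euro sign starts."""
    return [m.start() for m in EURO_UTF8.finditer(data)]


# Hypothetical sample input, encoded to UTF-8 before matching.
sample = "price: 5 € or 7 €".encode("utf-8")
offsets = find_euro_offsets(sample)
print(offsets)
```

Matching is done on raw bytes, so the offsets are byte positions in the encoded input, not character positions. The same reasoning applies to any lexer that works on single bytes and has no Unicode character classes.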