From db209c95f987e48c4b33af543abd965c6b0a0d79 Mon Sep 17 00:00:00 2001
From: Anatol Belski
Date: Wed, 13 Sep 2017 05:11:19 +0000
Subject: [PATCH] Reword the multibyte matching description

git-svn-id: https://svn.php.net/repository/phpdoc/en/trunk@343081 c90b9560-bf6c-de11-be94-00142212c4b1
---
 reference/parle/pattern.matching.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/reference/parle/pattern.matching.xml b/reference/parle/pattern.matching.xml
index c7bfbb62ed..8005154e9f 100644
--- a/reference/parle/pattern.matching.xml
+++ b/reference/parle/pattern.matching.xml
@@ -8,7 +8,7 @@
 Parle supports regex matching similar to flex. Also supported are the following POSIX character sets: [:alnum:], [:alpha:], [:blank:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:] and [:xdigit:].
- The Unicode character classes are currently not supported. The underlying library supports it through UTF-32, at the moment it is not implemented. A particular encoding however is supported when the correctly constructed regex. For example, to match the EURO symbol encoded in UTF-8, the regular expression [\xE2][\x82][\xAC] can be used. Or in general, the regex matching a UTF-8 encoded string could be [ -\x7f]{+}[\x80-\xbf]{+}[\xc2-\xdf]{+}[\xe0-\xef]{+}[\xf0-\xff]{-}[\"\\\]|\\\([\"\\\/bfnrt]|u[0-9a-fA-F]{4}).
+ The Unicode character classes are currently not supported. A particular encoding can be mapped with a correctly constructed regex. For example, to match the EURO symbol encoded in UTF-8, the regular expression [\xe2][\x82][\xac] can be used. The pattern for an UTF-8 encoded string could be [ -\x7f]{+}[\x80-\xbf]{+}[\xc2-\xdf]{+}[\xe0-\xef]{+}[\xf0-\xff]{-}[\"\\\]|\\\([\"\\\/bfnrt]|u[0-9a-fA-F]{4}).
Character representations
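
The byte-wise idea in the reworded paragraph can be illustrated outside Parle. The sketch below is not Parle code: it assumes Python's `re` module on `bytes` as a stand-in so the example runs without the extension, and the names `EURO_PATTERN` and `find_euro` are made up for illustration. It only shows that the euro sign encodes to the bytes E2 82 AC in UTF-8 and that a pattern written byte by byte, like the one in the patch, locates it.

```python
import re

# Byte-level pattern from the patch text: the three UTF-8 bytes of U+20AC.
# Assumption: Python's re module stands in for Parle here; Parle's own
# lexer API is not used or modelled.
EURO_PATTERN = re.compile(rb"[\xe2][\x82][\xac]")


def find_euro(text):
    """Return the byte offsets of each euro sign in the UTF-8 encoding of text."""
    data = text.encode("utf-8")
    return [m.start() for m in EURO_PATTERN.finditer(data)]


if __name__ == "__main__":
    # The euro sign occupies exactly these three bytes in UTF-8.
    print("\u20ac".encode("utf-8") == b"\xe2\x82\xac")
    # Offsets are counted in bytes, not characters.
    print(find_euro("5 \u20ac and 7 \u20ac"))
```

Because the pattern works on raw bytes, the reported offsets are byte positions in the encoded input, which is also how a lexer without Unicode character classes would see the data.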