More on pattern matching in parle

git-svn-id: https://svn.php.net/repository/phpdoc/en/trunk@343039 c90b9560-bf6c-de11-be94-00142212c4b1
Anatol Belski 2017-09-10 09:29:52 +00:00
parent 9020e196f8
commit 7a307787b1

@ -5,7 +5,7 @@
<title>Parle pattern matching</title>
<titleabbrev>Pattern matching</titleabbrev>
<para>Parle supports simplified regex matching. The supported syntax is a subset of the features defined by PCRE.</para>
<section xml:id="parle.regex.syntax">
<section xml:id="parle.regex.chars">
<title>Character representations</title>
<para>
<table>
@ -24,7 +24,7 @@
<entry>\b</entry><entry>Backspace.</entry>
</row>
<row>
<entry>\e</entry><entry>ESC, \x1b.</entry>
<entry>\e</entry><entry>ESC character, \x1b.</entry>
</row>
<row>
<entry>\n</entry><entry>Newline.</entry>
@ -55,6 +55,166 @@
</table>
</para>
</section>
<section xml:id="parle.regex.charclass">
<title>Character classes</title>
<para>
<table>
<title>Character classes</title>
<tgroup cols="2">
<thead>
<row>
<entry>Sequence</entry><entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>[...]</entry><entry>A single character listed or contained within a listed range. Ranges can be combined with the <literal>{+}</literal> and <literal>{-}</literal> operators. For example <literal>[a-z]{+}[0-9]</literal> is the same as <literal>[0-9a-z]</literal> and <literal>[a-z]{-}[aeiou]</literal> is the same as <literal>[b-df-hj-np-tv-z]</literal>.</entry>
</row>
<row>
<entry>[^...]</entry><entry>A single character not listed and not contained within a listed range.</entry>
</row>
<row>
<entry>.</entry><entry>Any character except newline, <literal>[^\n]</literal>.</entry>
</row>
<row>
<entry>\d</entry><entry>Digit character, <literal>[0-9]</literal>.</entry>
</row>
<row>
<entry>\D</entry><entry>Non-digit character, <literal>[^0-9]</literal>.</entry>
</row>
<row>
<entry>\s</entry><entry>White space character, <literal>[ \t\n\r\f\v]</literal>.</entry>
</row>
<row>
<entry>\S</entry><entry>Non-white space character, <literal>[^ \t\n\r\f\v]</literal>.</entry>
</row>
<row>
<entry>\w</entry><entry>Word character, <literal>[a-zA-Z0-9_]</literal>.</entry>
</row>
<row>
<entry>\W</entry><entry>Non-word character, <literal>[^a-zA-Z0-9_]</literal>.</entry>
</row>
</tbody>
</tgroup>
</table>
</para>
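<para>
The equivalence stated for the <literal>{-}</literal> operator can be checked by hand. The following is a minimal sketch, not part of Parle itself: it assumes plain PHP, builds the set difference with <function>array_diff</function>, and uses PCRE through <function>preg_grep</function> only as a stand-in to enumerate what the equivalent class matches. The variable names are illustrative.
</para>
<example>
<title>Checking the character class difference by enumeration (sketch)</title>
<programlisting role="php">
<![CDATA[
<?php
// Characters that [a-z]{-}[aeiou] should match, computed as a plain set difference.
$expected = array_values(array_diff(range('a', 'z'), ['a', 'e', 'i', 'o', 'u']));

// Characters matched by the equivalent class from the table.
// PCRE is used here as a stand-in; the Parle lexer is not exercised.
$actual = array_values(preg_grep('/^[b-df-hj-np-tv-z]$/', range('a', 'z')));

var_dump($expected === $actual); // bool(true)
?>
]]>
</programlisting>
</example>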
</section>
<section xml:id="parle.regex.alternation">
<title>Alternation and repetition</title>
<para>
<table>
<title>Alternation and repetition</title>
<tgroup cols="3">
<thead>
<row>
<entry>Sequence</entry><entry>Greedy</entry><entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>...|...</entry><entry>-</entry><entry>Match either the subpattern on the left or the subpattern on the right.</entry>
</row>
<row>
<entry>*</entry><entry>yes</entry><entry>Match 0 or more times.</entry>
</row>
<row>
<entry>+</entry><entry>yes</entry><entry>Match 1 or more times.</entry>
</row>
<row>
<entry>?</entry><entry>yes</entry><entry>Match 0 or 1 times.</entry>
</row>
<row>
<entry>{n}</entry><entry>no</entry><entry>Match exactly n times.</entry>
</row>
<row>
<entry>{n,}</entry><entry>yes</entry><entry>Match at least n times.</entry>
</row>
<row>
<entry>{n,m}</entry><entry>yes</entry><entry>Match at least n times but no more than m times.</entry>
</row>
<row>
<entry>*?</entry><entry>no</entry><entry>Match 0 or more times, preferring the shortest match.</entry>
</row>
<row>
<entry>+?</entry><entry>no</entry><entry>Match 1 or more times, preferring the shortest match.</entry>
</row>
<row>
<entry>??</entry><entry>no</entry><entry>Match 0 or 1 times, preferring the shortest match.</entry>
</row>
<row>
<entry>{n,}?</entry><entry>no</entry><entry>Match at least n times, preferring the shortest match.</entry>
</row>
<row>
<entry>{n,m}?</entry><entry>no</entry><entry>Match at least n times but no more than m times, preferring the shortest match.</entry>
</row>
<row>
<entry>{MACRO}</entry><entry>-</entry><entry>Include the regex MACRO in the current regex.</entry>
</row>
</tbody>
</tgroup>
</table>
</para>
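<para>
The difference between the greedy and non-greedy forms is easiest to see side by side. The sketch below is an illustration only: it assumes plain PHP and uses PCRE through <function>preg_match</function> as a stand-in, since the syntax above is a PCRE subset. The Parle lexer itself is not exercised and may differ in details, and the subject string is made up for the example.
</para>
<example>
<title>Greedy versus non-greedy repetition (sketch using PCRE)</title>
<programlisting role="php">
<![CDATA[
<?php
$subject = '"one" "two"';

// Greedy: .* takes as much as possible, so the match runs to the last quote.
preg_match('/".*"/', $subject, $greedy);

// Non-greedy: .*? takes as little as possible, so the match stops at the first closing quote.
preg_match('/".*?"/', $subject, $lazy);

echo $greedy[0], PHP_EOL; // "one" "two"
echo $lazy[0], PHP_EOL;   // "one"
?>
]]>
</programlisting>
</example>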
</section>
<section xml:id="parle.regex.anchors">
<title>Anchors</title>
<para>
<table>
<title>Anchors</title>
<tgroup cols="2">
<thead>
<row>
<entry>Sequence</entry><entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>^</entry><entry>Start of string or after a newline.</entry>
</row>
<row>
<entry>$</entry><entry>End of string or before a newline.</entry>
</row>
</tbody>
</tgroup>
</table>
</para>
</section>
<section xml:id="parle.regex.grouping">
<title>Grouping</title>
<para>
<table>
<title>Grouping</title>
<tgroup cols="2">
<thead>
<row>
<entry>Sequence</entry><entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>(...)</entry><entry>Group a regular expression to override default operator precedence.</entry>
</row>
<row>
<entry>(?r-s:pattern)</entry>
<entry>
Apply option r and omit option s while interpreting pattern. Options may be zero or more of the characters i, s, or x.
i means case-insensitive; -i means case-sensitive.
s alters the meaning of '.' to match any character whatsoever; -s alters the meaning of '.' to match any character except '\n'.
x ignores comments and whitespace in patterns. Whitespace is ignored unless it is backslash-escaped, contained within double quotes, or appears inside a character range.
These options can be applied globally at the rules level by passing a combination of the bit flags to the lexer.
</entry>
</row>
<row>
<entry>(?# comment )</entry><entry>Omit everything within (). The first ) character encountered ends the pattern. It is not possible for the comment to contain a ) character. The comment may span lines.</entry>
</row>
</tbody>
</tgroup>
</table>
</para>
</section>
</chapter>
<!-- Keep this comment at the end of the file