Added information, fixed large lines.

git-svn-id: https://svn.php.net/repository/phpdoc/en/trunk@247722 c90b9560-bf6c-de11-be94-00142212c4b1
2025-03-15 16:38:54 +00:00 · 2007-12-06 22:54:23 +00:00 · 2007-12-06 22:54:23 +00:00 · 77d57327f0
commit 77d57327f0
parent 6e0001af5e
1 changed files with 62 additions and 43 deletions
--- a/reference/pcre/pattern.syntax.xml
+++ b/reference/pcre/pattern.syntax.xml
@@ -1,5 +1,5 @@
 <?xml version="1.0" encoding="iso-8859-1"?>
-<!-- $Revision: 1.18 $ -->
+<!-- $Revision: 1.19 $ -->
 <!-- splitted from ./en/functions/pcre.xml, last change in rev 1.2 -->
  <refentry xml:id="reference.pcre.pattern.syntax" xmlns="http://docbook.org/ns/docbook">
   <refnamediv>
@@ -80,8 +80,8 @@
      </listitem>
      <listitem>
      <simpara>
-       Fairly obviously, PCRE does not support the (?{code})
-       construction.
+       Fairly obviously, PCRE does not support the (?{code}) and (??{code})
+       constructions. However, there is support for recursive patterns.
      </simpara>
      </listitem>
      <listitem>
@@ -658,18 +658,18 @@
      </varlistentry>
     </variablelist>
     <para>
-      The property names represented by <literal>xx</literal> above are limited to the Unicode
-      general category properties. Each character has exactly one such
-      property, specified by a two-letter abbreviation. For compatibility with
+      The property names represented by <literal>xx</literal> above are limited 
+      to the Unicode general category properties. Each character has exactly one 
+      such property, specified by a two-letter abbreviation. For compatibility with
      Perl, negation can be specified by including a circumflex between the
-      opening brace and the property name. For example, <literal>\p{^Lu}</literal> is the same
-      as <literal>\P{Lu}</literal>.
+      opening brace and the property name. For example, <literal>\p{^Lu}</literal> 
+      is the same as <literal>\P{Lu}</literal>.
     </para>
     <para>
-      If only one letter is specified with <literal>\p</literal> or <literal>\P</literal>, it includes all the
-      properties that start with that letter. In this case, in the absence of
-      negation, the curly brackets in the escape sequence are optional; these
-      two examples have the same effect:
+      If only one letter is specified with <literal>\p</literal> or 
+      <literal>\P</literal>, it includes all the properties that start with that
+      letter. In this case, in the absence of negation, the curly brackets in the 
+      escape sequence are optional; these two examples have the same effect:
     </para>
     <literallayout>
      \p{L}
@@ -728,9 +728,9 @@
      For example, <literal>\p{Lu}</literal> always matches only upper case letters.
     </para>
     <para>
-      The <literal>\X</literal> escape matches any number of Unicode characters that form an
-      extended Unicode sequence. <literal>\X</literal> is equivalent to
-      <literal>(?>\PM\pM*)</literal>.
+      The <literal>\X</literal> escape matches any number of Unicode characters 
+      that form an extended Unicode sequence. <literal>\X</literal> is equivalent 
+      to <literal>(?>\PM\pM*)</literal>.
     </para>
     <para>
      That is, it matches a character without the "mark" property, followed
@@ -741,8 +741,9 @@
     <para>
      Matching characters by Unicode property is not fast, because PCRE has
      to search a structure that contains data for over fifteen thousand
-      characters. That is why the traditional escape sequences such as <literal>\d</literal> and
-      <literal>\w</literal> do not use Unicode properties in PCRE.
+      characters. That is why the traditional escape sequences such as 
+      <literal>\d</literal> and <literal>\w</literal> do not use Unicode properties 
+      in PCRE.
     </para>
    </refsect2>

@@ -801,7 +802,8 @@
      Note that the sequences \A, \Z, and \z can be used to  match
      the  start  and end of the subject in both modes, and if all
      branches of a pattern start with \A it is  always  anchored,
-      whether <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link>  is set or not.
+      whether <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link>  
+      is set or not.
     </para>
    </refsect2>

@@ -940,8 +942,8 @@
      <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>,
      <link linkend="reference.pcre.pattern.modifiers">PCRE_UNGREEDY</link>,
      <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTRA</link>,
-      and  <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link>
-      can be changed from within the pattern by
+      <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link>
+      and PCRE_DUPNAMES can be changed from within the pattern by
      a sequence of Perl option letters enclosed between "(?"  and
      ")". The option letters are:

@@ -973,6 +975,10 @@
          <entry><literal>X</literal></entry>
          <entry>for <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTRA</link></entry>
         </row>
+         <row>
+          <entry><literal>J</literal></entry>
+          <entry>for <link linkend="reference.pcre.pattern.modifiers">PCRE_DUPNAMES</link></entry>
+         </row>
        </tbody>
       </tgroup>
      </table>
@@ -1021,7 +1027,8 @@
      compile  time. There would be some very weird behaviour otherwise.
     </para>
     <para>
-      The PCRE-specific options <link linkend="reference.pcre.pattern.modifiers">PCRE_UNGREEDY</link>  and  
+      The PCRE-specific options <link 
+      linkend="reference.pcre.pattern.modifiers">PCRE_UNGREEDY</link>  and  
      <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTRA</link>   can
      be changed in the same way as the Perl-compatible options by
      using the characters U and X  respectively.  The  (?X)  flag
@@ -1106,9 +1113,9 @@
     
     <para>
      It is possible to name the subpattern with
-      <literal>(?P&lt;name&gt;pattern)</literal> since PHP 4.3.3. Array with matches will
-      contain the match indexed by the string alongside the match indexed by
-      a number, then.
+      <literal>(?P&lt;name&gt;pattern)</literal> since PHP 4.3.3. The array of
+      matches will then contain the match indexed by the name alongside the
+      match indexed by a number.
     </para>
    </refsect2>
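
As an illustration of the named subpattern syntax documented in this hunk, here is a minimal sketch using Python's `re` module, which accepts the same `(?P<name>pattern)` form. Python is only a stand-in chosen for this example; in PHP the equivalent call would be `preg_match`. The sample pattern and subject string are assumptions made up for the illustration.

```python
import re

# (?P<name>pattern) names a capturing group; the group keeps its number too.
pattern = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})")
match = pattern.search("released 2007-12")

# The same captured text is reachable by name and by number.
by_name = match.group("year")
by_number = match.group(1)

# All named groups as a dictionary, keyed by group name.
all_named = match.groupdict()
```

Both lookups return the same captured text, which mirrors the paragraph's point that the match is available under its name as well as its number.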

@@ -1237,7 +1244,8 @@
     that is the only way the rest of the pattern matches.
    </para>
    <para>
-     If the <link linkend="reference.pcre.pattern.modifiers">PCRE_UNGREEDY</link>  option is set (an option which  is  not
+     If the <link linkend="reference.pcre.pattern.modifiers">PCRE_UNGREEDY</link>  
+     option is set (an option which  is  not
     available  in  Perl)  then the quantifiers are not greedy by
     default, but individual ones can be made greedy by following
     them  with  a  question mark. In other words, it inverts the
@@ -1248,7 +1256,8 @@
     as many characters as possible and don't return to match the rest of the
     pattern. Thus <literal>.*abc</literal> matches "aabc" but
     <literal>.*+abc</literal> doesn't because <literal>.*+</literal> eats the
-     whole string. Possessive quantifiers can be used to speed up processing since PHP 4.3.3.
+     whole string. Possessive quantifiers can be used to speed up processing 
+     since PHP 4.3.3.
    </para>
    <para>
     When a parenthesized subpattern is quantified with a minimum
@@ -1257,7 +1266,8 @@
     proportion to the size of the minimum or maximum.
    </para>
    <para>
-     If a pattern starts with .* or  .{0,}  and  the  <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link> 
+     If a pattern starts with .* or  .{0,}  and  the  <link 
+     linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link> 
     option (equivalent to Perl's /s) is set, thus allowing the .
     to match newlines, then the pattern is implicitly  anchored,
     because whatever follows will be tried against every character
@@ -1265,7 +1275,9 @@
     retrying  the overall match at any position after the first.
     PCRE treats such a pattern as though it were preceded by \A.
     In  cases where it is known that the subject string contains
-     no newlines, it is worth setting <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>  when  the  pattern begins with .* in order to
+     no newlines, it is worth setting <link 
+     linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>  when  the  
+     pattern begins with .* in order to
     obtain this optimization, or
     alternatively using ^ to indicate anchoring explicitly.
    </para>
@@ -1337,8 +1349,9 @@
     following the backslash are taken as  part  of  a  potential
     back reference number. If the pattern continues with a digit
     character, then some delimiter must be used to terminate the
-     back reference. If the <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link>  option is set, this can
-     be whitespace.  Otherwise an empty comment can be used.
+     back reference. If the <link 
+     linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link>  option 
+     is set, this can be whitespace.  Otherwise an empty comment can be used.
    </para>
    <para>
     A back reference that occurs inside the parentheses to which
@@ -1360,8 +1373,8 @@
     <para>
      Back references to the named subpatterns can be achieved by
      <literal>(?P=name)</literal> or, since PHP 5.2.4, also by
-      <literal>\k&lt;name&gt;</literal>, <literal>\k'name'</literal> or
-      <literal>\k{name}</literal>.
+      <literal>\k&lt;name&gt;</literal>, <literal>\k'name'</literal>,
+      <literal>\k{name}</literal> or <literal>\g{name}</literal>.
     </para>
    </refsect2>
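
A minimal sketch of a named back reference, again using Python's `re` module as a stand-in for PCRE. Note the assumption: Python supports only the `(?P=name)` form, so the `\k` and `\g{name}` forms mentioned in this hunk are PCRE syntax and are not exercised here. The pattern and subject string are made up for the illustration.

```python
import re

# (?P=word) must match exactly the text that the group "word" captured,
# so this pattern finds a word that is immediately repeated.
doubled = re.compile(r"\b(?P<word>\w+)\s+(?P=word)\b")

found = doubled.search("this is is a test")
repeated_word = found.group("word")
whole_match = found.group(0)

# No word is repeated here, so the search returns None.
not_found = doubled.search("this is a test")
```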

@@ -1636,8 +1649,9 @@
     condition is satisfied if the capturing subpattern  of  that
     number  has  previously matched. Consider the following pattern,
     which contains non-significant white space to make  it
-     more  readable  (assume  the  <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link>   option)  and to
-     divide it into three parts for ease of discussion:
+     more  readable  (assume  the  <link 
+     linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link> 
+     option)  and to divide it into three parts for ease of discussion:

       <literal>( \( )?    [^()]+    (?(1) \) )</literal>
    </para>
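
The conditional pattern shown in this hunk can be tried out with the following sketch. It assumes Python's `re` module as a stand-in for PCRE, with `re.VERBOSE` playing the role of PCRE_EXTENDED so that the white space in the pattern is ignored. The subject strings are assumptions chosen for the illustration.

```python
import re

# Three parts: an optional opening parenthesis (group 1), a run of
# non-parenthesis characters, and a closing parenthesis that is required
# only if group 1 took part in the match.
balanced = re.compile(r"( \( )?    [^()]+    (?(1) \) )", re.VERBOSE)

def is_accepted(subject: str) -> bool:
    """Return True if the whole subject matches the conditional pattern."""
    return balanced.fullmatch(subject) is not None

results = {s: is_accepted(s) for s in ["(abc)", "abc", "(abc", "abc)"]}
```

Using `fullmatch` forces the pattern to account for the whole subject, which makes the effect of the condition visible: an unmatched parenthesis on either side causes the match to fail.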
@@ -1693,9 +1707,10 @@
     comment play no part in the pattern matching at all.
    </para>
    <para>
-     If the <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link>  option is set, an unescaped # character
-     outside  a character class introduces a comment that
-     continues up to the next newline character in the pattern.
+     If the <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link>  
+     option is set, an unescaped # character outside  a character class 
+     introduces a comment that continues up to the next newline character 
+     in the pattern.
     </para>
    </refsect2>

@@ -1763,9 +1778,10 @@
     </para>
     
     <para>
-      Since PHP 4.3.3, <literal>(?1)</literal>, <literal>(?2)</literal> and so on can be used
-      for recursive subpatterns too. It is also possible to use named
-      subpatterns: <literal>(?P&lt;name&gt;foo)</literal>.
+      Since PHP 4.3.3, <literal>(?1)</literal>, <literal>(?2)</literal> and so on 
+      can be used for recursive subpatterns too. It is also possible to use named
+      subpatterns: <literal>(?P&gt;name)</literal> or
+      <literal>(?&amp;name)</literal>.
     </para>
     <para>
      If the syntax for a recursive subpattern reference (either by number or
@@ -1803,10 +1819,12 @@
     regular expressions for efficient performance.
    </para>
    <para>
-     When a pattern begins with .* and the <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>  option  is
+     When a pattern begins with .* and the <link 
+     linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>  option  is
     set,  the  pattern  is implicitly anchored by PCRE, since it
     can match only at the start of a subject string. However, if
-     <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>   is not set, PCRE cannot make this optimization,
+     <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>   
+     is not set, PCRE cannot make this optimization,
     because the . metacharacter does not then match  a  newline,
     and if the subject string contains newlines, the pattern may
     match from the character immediately following one  of  them
@@ -1822,7 +1840,8 @@
    <para>
     If you are using such a pattern with subject strings that do
     not  contain  newlines,  the best performance is obtained by
-     setting <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>, or starting the  pattern  with  ^.*  to
+     setting <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>,
+     or starting the  pattern  with  ^.*  to
     indicate  explicit anchoring. That saves PCRE from having to
     scan along the subject looking for a newline to restart at.
    </para>