mirror of
https://github.com/sigmasternchen/php-doc-en
synced 2025-03-16 00:48:54 +00:00
There were forgotten division-marks inside words (Jakub Vrana)
git-svn-id: https://svn.php.net/repository/phpdoc/en/trunk@145358 c90b9560-bf6c-de11-be94-00142212c4b1
parent 001059fd7f
commit 73dbf9a98f
1 changed file with 79 additions and 79 deletions
|
@@ -1,5 +1,5 @@
|
|||
<?xml version="1.0" encoding="iso-8859-1"?>
|
||||
<!-- $Revision: 1.5 $ -->
|
||||
<!-- $Revision: 1.6 $ -->
|
||||
<!-- splitted from ./en/functions/pcre.xml, last change in rev 1.2 -->
|
||||
<refentry id="pcre.pattern.syntax">
|
||||
<refnamediv>
|
||||
|
@@ -47,10 +47,10 @@
|
|||
</listitem>
|
||||
<listitem>
|
||||
<simpara>
|
||||
Capturing subpatterns that occur inside negative looka-
|
||||
head assertions are counted, but their entries in the
|
||||
offsets vector are never set. Perl sets its numerical vari-
|
||||
ables from any such patterns that are matched before the
|
||||
Capturing subpatterns that occur inside negative
|
||||
lookahead assertions are counted, but their entries in the
|
||||
offsets vector are never set. Perl sets its numerical
|
||||
variables from any such patterns that are matched before the
|
||||
assertion fails to match something (thereby succeeding), but
|
||||
only if the negative lookahead assertion contains just one
|
||||
branch.
|
||||
|
@@ -68,8 +68,8 @@
|
|||
<simpara>
|
||||
The following Perl escape sequences are not supported:
|
||||
\l, \u, \L, \U, \E, \Q. In fact these are implemented by
|
||||
Perl's general string-handling and are not part of its pat-
|
||||
tern matching engine.
|
||||
Perl's general string-handling and are not part of its
|
||||
pattern matching engine.
|
||||
</simpara>
|
||||
</listitem>
|
||||
<listitem>
|
||||
|
@@ -123,7 +123,7 @@
|
|||
<simpara>
|
||||
If <link linkend="pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link> is set and
|
||||
<link linkend="pcre.pattern.modifiers">PCRE_MULTILINE</link> is not
|
||||
set, the $ meta- character matches only at the very end of
|
||||
set, the $ meta-character matches only at the very end of
|
||||
the string.
|
||||
</simpara>
|
||||
</listitem>
|
||||
|
@@ -135,8 +135,8 @@
|
|||
</listitem>
|
||||
<listitem>
|
||||
<simpara>
|
||||
If <link linkend="pcre.pattern.modifiers">PCRE_UNGREEDY</link> is set, the greediness of the repeti-
|
||||
tion quantifiers is inverted, that is, by default they are
|
||||
If <link linkend="pcre.pattern.modifiers">PCRE_UNGREEDY</link> is set, the greediness of the
|
||||
repetition quantifiers is inverted, that is, by default they are
|
||||
not greedy, but if followed by a question mark they are.
|
||||
</simpara>
|
||||
</listitem>
|
||||
|
@@ -152,8 +152,8 @@
|
|||
<refsect2 id="regexp.introduction">
|
||||
<title>Introduction</title>
|
||||
<para>
|
||||
The syntax and semantics of the regular expressions sup-
|
||||
ported by PCRE are described below. Regular expressions are
|
||||
The syntax and semantics of the regular expressions
|
||||
supported by PCRE are described below. Regular expressions are
|
||||
also described in the Perl documentation and in a number of
|
||||
other books, some of which have copious examples. Jeffrey
|
||||
Friedl's "Mastering Regular Expressions", published by
|
||||
|
@@ -162,8 +162,8 @@
|
|||
|
||||
A regular expression is a pattern that is matched against a
|
||||
subject string from left to right. Most characters stand for
|
||||
themselves in a pattern, and match the corresponding charac-
|
||||
ters in the subject. As a trivial example, the pattern
|
||||
themselves in a pattern, and match the corresponding
|
||||
characters in the subject. As a trivial example, the pattern
|
||||
<literal>The quick brown fox</literal>
|
||||
matches a portion of a subject string that is identical to
|
||||
itself.
|
||||
|
@@ -173,9 +173,9 @@
|
|||
<title>Meta-characters</title>
|
||||
<para>
|
||||
The power of regular expressions comes from the
|
||||
ability to include alternatives and repetitions in the pat-
|
||||
tern. These are encoded in the pattern by the use of <emphasis>meta</emphasis>-
|
||||
<emphasis>characters</emphasis>, which do not stand for themselves but instead
|
||||
ability to include alternatives and repetitions in the
|
||||
pattern. These are encoded in the pattern by the use of
|
||||
<emphasis>meta-characters</emphasis>, which do not stand for themselves but instead
|
||||
are interpreted in some special way.
|
||||
</para>
|
||||
<para>
|
||||
|
@@ -299,8 +299,8 @@
|
|||
</variablelist>
|
||||
|
||||
Part of a pattern that is in square brackets is called a
|
||||
"character class". In a character class the only meta-
|
||||
characters are:
|
||||
"character class". In a character class the only
|
||||
meta-characters are:
|
||||
<variablelist>
|
||||
<varlistentry>
|
||||
<term><emphasis>\</emphasis></term>
|
||||
|
@@ -350,23 +350,23 @@
|
|||
</para>
|
||||
<para>
|
||||
For example, if you want to match a "*" character, you write
|
||||
"\*" in the pattern. This applies whether or not the follow-
|
||||
ing character would otherwise be interpreted as a meta-
|
||||
character, so it is always safe to precede a non-alphanumeric
|
||||
with "\" to specify that it stands for itself. In particu-
|
||||
lar, if you want to match a backslash, you write "\\".
|
||||
"\*" in the pattern. This applies whether or not the
|
||||
following character would otherwise be interpreted as a
|
||||
meta-character, so it is always safe to precede a non-alphanumeric
|
||||
with "\" to specify that it stands for itself. In
|
||||
particular, if you want to match a backslash, you write "\\".
|
||||
</para>
|
||||
<para>
|
||||
If a pattern is compiled with the <link linkend="pcre.pattern.modifiers">PCRE_EXTENDED</link> option, whi-
|
||||
tespace in the pattern (other than in a character class) and
|
||||
If a pattern is compiled with the <link linkend="pcre.pattern.modifiers">PCRE_EXTENDED</link> option,
|
||||
whitespace in the pattern (other than in a character class) and
|
||||
characters between a "#" outside a character class and the
|
||||
next newline character are ignored. An escaping backslash
|
||||
can be used to include a whitespace or "#" character as part
|
||||
of the pattern.
|
||||
</para>
|
||||
<para>
|
||||
A second use of backslash provides a way of encoding non-
|
||||
printing characters in patterns in a visible manner. There
|
||||
A second use of backslash provides a way of encoding
|
||||
non-printing characters in patterns in a visible manner. There
|
||||
is no restriction on the appearance of non-printing characters,
|
||||
apart from the binary zero that terminates a pattern,
|
||||
but when a pattern is being prepared by text editing, it is
|
||||
|
@@ -569,8 +569,8 @@
|
|||
</variablelist>
|
||||
</para>
|
||||
<para>
|
||||
Note that octal values of 100 or greater must not be intro-
|
||||
duced by a leading zero, because no more than three octal
|
||||
Note that octal values of 100 or greater must not be
|
||||
introduced by a leading zero, because no more than three octal
|
||||
digits are ever read.
|
||||
</para>
|
||||
<para>
|
||||
|
@@ -581,8 +581,8 @@
|
|||
class it has a different meaning (see below).
|
||||
</para>
|
||||
<para>
|
||||
The third use of backslash is for specifying generic charac-
|
||||
ter types:
|
||||
The third use of backslash is for specifying generic
|
||||
character types:
|
||||
</para>
|
||||
<para>
|
||||
<variablelist>
|
||||
|
@@ -647,8 +647,8 @@
|
|||
Perl "<literal>word</literal>". The definition of letters and digits is
|
||||
controlled by PCRE's character tables, and may vary if locale-specific
|
||||
matching is taking place (see "Locale support"
|
||||
above). For example, in the "fr" (French) locale, some char-
|
||||
acter codes greater than 128 are used for accented letters,
|
||||
above). For example, in the "fr" (French) locale, some
|
||||
character codes greater than 128 are used for accented letters,
|
||||
and these are matched by <literal>\w</literal>.
|
||||
</para>
|
||||
<para>
|
||||
|
@@ -659,8 +659,8 @@
|
|||
is no character to match.
|
||||
</para>
|
||||
<para>
|
||||
The fourth use of backslash is for certain simple asser-
|
||||
tions. An assertion specifies a condition that has to be met
|
||||
The fourth use of backslash is for certain simple
|
||||
assertions. An assertion specifies a condition that has to be met
|
||||
at a particular point in a match, without consuming any
|
||||
characters from the subject string. The use of subpatterns
|
||||
for more complicated assertions is described below. The
|
||||
|
@@ -752,11 +752,11 @@
|
|||
Circumflex need not be the first character of the pattern if
|
||||
a number of alternatives are involved, but it should be the
|
||||
first thing in each alternative in which it appears if the
|
||||
pattern is ever to match that branch. If all possible alter-
|
||||
natives start with a circumflex, that is, if the pattern is
|
||||
pattern is ever to match that branch. If all possible
|
||||
alternatives start with a circumflex, that is, if the pattern is
|
||||
constrained to match only at the start of the subject, it is
|
||||
said to be an "anchored" pattern. (There are also other con-
|
||||
structs that can cause a pattern to be anchored.)
|
||||
said to be an "anchored" pattern. (There are also other
|
||||
constructs that can cause a pattern to be anchored.)
|
||||
|
||||
A dollar character is an assertion which is &true; only if the
|
||||
current matching point is at the end of the subject string,
|
||||
|
@@ -779,10 +779,10 @@
|
|||
before an internal "\n" character, respectively, in addition
|
||||
to matching at the start and end of the subject string. For
|
||||
example, the pattern /^abc$/ matches the subject string
|
||||
"def\nabc" in multiline mode, but not otherwise. Conse-
|
||||
quently, patterns that are anchored in single line mode
|
||||
because all branches start with "^" are not anchored in mul-
|
||||
tiline mode. The <link linkend="pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link> option is ignored if
|
||||
"def\nabc" in multiline mode, but not otherwise.
|
||||
Consequently, patterns that are anchored in single line mode
|
||||
because all branches start with "^" are not anchored in
|
||||
multiline mode. The <link linkend="pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link> option is ignored if
|
||||
<link linkend="pcre.pattern.modifiers">PCRE_MULTILINE</link> is set.
|
||||
|
||||
Note that the sequences \A, \Z, and \z can be used to match
|
||||
|
@@ -798,9 +798,9 @@
|
|||
Outside a character class, a dot in the pattern matches any
|
||||
one character in the subject, including a non-printing
|
||||
character, but not (by default) newline. If the <link linkend="pcre.pattern.modifiers">PCRE_DOTALL</link>
|
||||
option is set, then dots match newlines as well. The han-
|
||||
dling of dot is entirely independent of the handling of cir-
|
||||
cumflex and dollar, the only relationship being that they
|
||||
option is set, then dots match newlines as well. The
|
||||
handling of dot is entirely independent of the handling of
|
||||
circumflex and dollar, the only relationship being that they
|
||||
both involve newline characters. Dot has no special meaning
|
||||
in a character class.
|
||||
</literallayout>
|
||||
|
@@ -809,25 +809,25 @@
|
|||
<refsect2 id="regexp.reference.squarebrackets">
|
||||
<title>Square brackets</title>
|
||||
<literallayout>
|
||||
An opening square bracket introduces a character class, ter-
|
||||
minated by a closing square bracket. A closing square
|
||||
An opening square bracket introduces a character class,
|
||||
terminated by a closing square bracket. A closing square
|
||||
bracket on its own is not special. If a closing square
|
||||
bracket is required as a member of the class, it should be
|
||||
the first data character in the class (after an initial cir-
|
||||
cumflex, if present) or escaped with a backslash.
|
||||
the first data character in the class (after an initial
|
||||
circumflex, if present) or escaped with a backslash.
|
||||
|
||||
A character class matches a single character in the subject;
|
||||
the character must be in the set of characters defined by
|
||||
the class, unless the first character in the class is a cir-
|
||||
cumflex, in which case the subject character must not be in
|
||||
the class, unless the first character in the class is a
|
||||
circumflex, in which case the subject character must not be in
|
||||
the set defined by the class. If a circumflex is actually
|
||||
required as a member of the class, ensure it is not the
|
||||
first character, or escape it with a backslash.
|
||||
|
||||
For example, the character class [aeiou] matches any lower
|
||||
case vowel, while [^aeiou] matches any character that is not
|
||||
a lower case vowel. Note that a circumflex is just a con-
|
||||
venient notation for specifying the characters which are in
|
||||
a lower case vowel. Note that a circumflex is just a
|
||||
convenient notation for specifying the characters which are in
|
||||
the class by enumerating those that are not. It is not an
|
||||
assertion: it still consumes a character from the subject
|
||||
string, and fails if the current pointer is at the end of
|
||||
|
@@ -836,8 +836,8 @@
|
|||
When caseless matching is set, any letters in a class
|
||||
represent both their upper case and lower case versions, so
|
||||
for example, a caseless [aeiou] matches "A" as well as "a",
|
||||
and a caseless [^aeiou] does not match "A", whereas a case-
|
||||
ful version would.
|
||||
and a caseless [^aeiou] does not match "A", whereas a
|
||||
caseful version would.
|
||||
|
||||
The newline character is never treated in any special way in
|
||||
character classes, whatever the setting of the <link linkend="pcre.pattern.modifiers">PCRE_DOTALL</link>
|
||||
|
@@ -848,17 +848,17 @@
|
|||
of characters in a character class. For example, [d-m]
|
||||
matches any letter between d and m, inclusive. If a minus
|
||||
character is required in a class, it must be escaped with a
|
||||
backslash or appear in a position where it cannot be inter-
|
||||
preted as indicating a range, typically as the first or last
|
||||
backslash or appear in a position where it cannot be
|
||||
interpreted as indicating a range, typically as the first or last
|
||||
character in the class.
|
||||
|
||||
It is not possible to have the literal character "]" as the
|
||||
end character of a range. A pattern such as [W-]46] is
|
||||
interpreted as a class of two characters ("W" and "-") fol-
|
||||
lowed by a literal string "46]", so it would match "W46]" or
|
||||
interpreted as a class of two characters ("W" and "-")
|
||||
followed by a literal string "46]", so it would match "W46]" or
|
||||
"-46]". However, if the "]" is escaped with a backslash it
|
||||
is interpreted as the end of range, so [W-\]46] is inter-
|
||||
preted as a single class containing a range followed by two
|
||||
is interpreted as the end of range, so [W-\]46] is
|
||||
interpreted as a single class containing a range followed by two
|
||||
separate characters. The octal or hexadecimal representation
|
||||
of "]" can also be used to end a range.
|
||||
|
||||
|
@@ -875,8 +875,8 @@
|
|||
appear in a character class, and add the characters that
|
||||
they match to the class. For example, [\dABCDEF] matches any
|
||||
hexadecimal digit. A circumflex can conveniently be used
|
||||
with the upper case character types to specify a more res-
|
||||
tricted set of characters than the matching lower case type.
|
||||
with the upper case character types to specify a more
|
||||
restricted set of characters than the matching lower case type.
|
||||
For example, the class [^\W_] matches any letter or digit,
|
||||
but not underscore.
|
||||
|
||||
|
@@ -984,8 +984,8 @@
|
|||
which can be nested. Marking part of a pattern as a subpattern
|
||||
does two things:
|
||||
|
||||
1. It localizes a set of alternatives. For example, the pat-
|
||||
tern
|
||||
1. It localizes a set of alternatives. For example, the
|
||||
pattern
|
||||
|
||||
cat(aract|erpillar|)
|
||||
|
||||
|
@@ -1131,8 +1131,8 @@
|
|||
|
||||
does the right thing with the C comments. The meaning of the
|
||||
various quantifiers is not otherwise changed, just the preferred
|
||||
number of matches. Do not confuse this use of ques-
|
||||
tion mark with its use as a quantifier in its own right.
|
||||
number of matches. Do not confuse this use of
|
||||
question mark with its use as a quantifier in its own right.
|
||||
Because it has two uses, it can sometimes appear doubled, as
|
||||
in
|
||||
|
||||
|
@@ -1374,8 +1374,8 @@
|
|||
<title>Once-only subpatterns</title>
|
||||
<literallayout>
|
||||
With both maximizing and minimizing repetition, failure of
|
||||
what follows normally causes the repeated item to be re-
|
||||
evaluated to see if a different number of repeats allows the
|
||||
what follows normally causes the repeated item to be
|
||||
re-evaluated to see if a different number of repeats allows the
|
||||
rest of the pattern to match. Sometimes it is useful to
|
||||
prevent this, either to change the nature of the match, or
|
||||
to cause it fail earlier than it otherwise might, when the
|
||||
|
@@ -1401,8 +1401,8 @@
|
|||
|
||||
This kind of parenthesis "locks up" the part of the pattern
|
||||
it contains once it has matched, and a failure further into
|
||||
the pattern is prevented from backtracking into it. Back-
|
||||
tracking past it to previous items, however, works as normal.
|
||||
the pattern is prevented from backtracking into it.
|
||||
Backtracking past it to previous items, however, works as normal.
|
||||
|
||||
An alternative description is that a subpattern of this type
|
||||
matches the string of characters that an identical standalone
|
||||
|
@@ -1419,8 +1419,8 @@
|
|||
This construction can of course contain arbitrarily complicated
|
||||
subpatterns, and it can be nested.
|
||||
|
||||
Once-only subpatterns can be used in conjunction with look-
|
||||
behind assertions to specify efficient matching at the end
|
||||
Once-only subpatterns can be used in conjunction with
|
||||
look-behind assertions to specify efficient matching at the end
|
||||
of the subject string. Consider a simple pattern such as
|
||||
|
||||
abcd$
|
||||
|
@@ -1547,8 +1547,8 @@
|
|||
comment play no part in the pattern matching at all.
|
||||
|
||||
If the <link linkend="pcre.pattern.modifiers">PCRE_EXTENDED</link> option is set, an unescaped # character
|
||||
outside a character class introduces a comment that contin-
|
||||
ues up to the next newline character in the pattern.
|
||||
outside a character class introduces a comment that
|
||||
continues up to the next newline character in the pattern.
|
||||
</literallayout>
|
||||
</refsect2>
|
||||
|
||||
|
@@ -1571,8 +1571,8 @@
|
|||
\( ( (?>[^()]+) | (?R) )* \)
|
||||
|
||||
First it matches an opening parenthesis. Then it matches any
|
||||
number of substrings which can either be a sequence of non-
|
||||
parentheses, or a recursive match of the pattern itself
|
||||
number of substrings which can either be a sequence of
|
||||
non-parentheses, or a recursive match of the pattern itself
|
||||
(i.e. a correctly parenthesized substring). Finally there is
|
||||
a closing parenthesis.
|
||||
|
||||
|
|