From 73dbf9a98fd344a5196ae2664e6cc32ce9177b83 Mon Sep 17 00:00:00 2001
From: Mehdi Achour
Date: Mon, 1 Dec 2003 23:44:09 +0000
Subject: [PATCH] There were forgotten division-marks inside words (Jakub Vrana)

git-svn-id: https://svn.php.net/repository/phpdoc/en/trunk@145358 c90b9560-bf6c-de11-be94-00142212c4b1
---
 .../pcre/functions/pcre.pattern.syntax.xml | 158 +++++++++---------
 1 file changed, 79 insertions(+), 79 deletions(-)

diff --git a/reference/pcre/functions/pcre.pattern.syntax.xml b/reference/pcre/functions/pcre.pattern.syntax.xml
index 73cffdbf51..ccdc5e0398 100644
--- a/reference/pcre/functions/pcre.pattern.syntax.xml
+++ b/reference/pcre/functions/pcre.pattern.syntax.xml
@@ -1,5 +1,5 @@
-
+
@@ -47,10 +47,10 @@
- Capturing subpatterns that occur inside negative looka-
- head assertions are counted, but their entries in the
- offsets vector are never set. Perl sets its numerical vari-
- ables from any such patterns that are matched before the
+ Capturing subpatterns that occur inside negative
+ lookahead assertions are counted, but their entries in the
+ offsets vector are never set. Perl sets its numerical
+ variables from any such patterns that are matched before the
  assertion fails to match something (thereby succeeding), but only if the negative lookahead assertion contains just one branch.
@@ -68,8 +68,8 @@
  The following Perl escape sequences are not supported: \l, \u, \L, \U, \E, \Q. In fact these are implemented by
- Perl's general string-handling and are not part of its pat-
- tern matching engine.
+ Perl's general string-handling and are not part of its
+ pattern matching engine.
@@ -123,7 +123,7 @@
  If PCRE_DOLLAR_ENDONLY is set and PCRE_MULTILINE is not
- set, the $ meta- character matches only at the very end of
+ set, the $ meta-character matches only at the very end of
  the string.
@@ -135,8 +135,8 @@
- If PCRE_UNGREEDY is set, the greediness of the repeti-
- tion quantifiers is inverted, that is, by default they are
+ If PCRE_UNGREEDY is set, the greediness of the
+ repetition quantifiers is inverted, that is, by default they are
  not greedy, but if followed by a question mark they are.
@@ -152,8 +152,8 @@
  Introduction
- The syntax and semantics of the regular expressions sup-
- ported by PCRE are described below. Regular expressions are
+ The syntax and semantics of the regular expressions
+ supported by PCRE are described below. Regular expressions are
  also described in the Perl documentation and in a number of other books, some of which have copious examples. Jeffrey Friedl's "Mastering Regular Expressions", published by
@@ -162,8 +162,8 @@
  A regular expression is a pattern that is matched against a subject string from left to right. Most characters stand for
- themselves in a pattern, and match the corresponding charac-
- ters in the subject. As a trivial example, the pattern
+ themselves in a pattern, and match the corresponding
+ characters in the subject. As a trivial example, the pattern
  The quick brown fox matches a portion of a subject string that is identical to itself.
@@ -173,9 +173,9 @@
  Meta-characters The power of regular expressions comes from the
- ability to include alternatives and repetitions in the pat-
- tern. These are encoded in the pattern by the use of meta-
- characters, which do not stand for themselves but instead
+ ability to include alternatives and repetitions in the
+ pattern. These are encoded in the pattern by the use of
+ meta-characters, which do not stand for themselves but instead
  are interpreted in some special way.
@@ -299,8 +299,8 @@
  Part of a pattern that is in square brackets is called a
- "character class". In a character class the only meta-
- characters are:
+ "character class".
In a character class the only
+ meta-characters are:
  \
@@ -350,23 +350,23 @@
  For example, if you want to match a "*" character, you write
- "\*" in the pattern. This applies whether or not the follow-
- ing character would otherwise be interpreted as a meta-
- character, so it is always safe to precede a non-alphanumeric
- with "\" to specify that it stands for itself. In particu-
- lar, if you want to match a backslash, you write "\\".
+ "\*" in the pattern. This applies whether or not the
+ following character would otherwise be interpreted as a
+ meta-character, so it is always safe to precede a non-alphanumeric
+ with "\" to specify that it stands for itself. In
+ particular, if you want to match a backslash, you write "\\".
- If a pattern is compiled with the PCRE_EXTENDED option, whi-
- tespace in the pattern (other than in a character class) and
+ If a pattern is compiled with the PCRE_EXTENDED option,
+ whitespace in the pattern (other than in a character class) and
  characters between a "#" outside a character class and the next newline character are ignored. An escaping backslash can be used to include a whitespace or "#" character as part of the pattern.
- A second use of backslash provides a way of encoding non-
- printing characters in patterns in a visible manner. There
+ A second use of backslash provides a way of encoding
+ non-printing characters in patterns in a visible manner. There
  is no restriction on the appearance of non-printing characters, apart from the binary zero that terminates a pattern, but when a pattern is being prepared by text editing, it is
@@ -569,8 +569,8 @@
- Note that octal values of 100 or greater must not be intro-
- duced by a leading zero, because no more than three octal
+ Note that octal values of 100 or greater must not be
+ introduced by a leading zero, because no more than three octal
  digits are ever read.
@@ -581,8 +581,8 @@
  class it has a different meaning (see below).
- The third use of backslash is for specifying generic charac-
- ter types:
+ The third use of backslash is for specifying generic
+ character types:
@@ -647,8 +647,8 @@
  Perl "word". The definition of letters and digits is controlled by PCRE's character tables, and may vary if locale-specific matching is taking place (see "Locale support"
- above). For example, in the "fr" (French) locale, some char-
- acter codes greater than 128 are used for accented letters,
+ above). For example, in the "fr" (French) locale, some
+ character codes greater than 128 are used for accented letters,
  and these are matched by \w.
@@ -659,8 +659,8 @@
  is no character to match.
- The fourth use of backslash is for certain simple asser-
- tions. An assertion specifies a condition that has to be met
+ The fourth use of backslash is for certain simple
+ assertions. An assertion specifies a condition that has to be met
  at a particular point in a match, without consuming any characters from the subject string. The use of subpatterns for more complicated assertions is described below. The
@@ -752,11 +752,11 @@
  Circumflex need not be the first character of the pattern if a number of alternatives are involved, but it should be the first thing in each alternative in which it appears if the
- pattern is ever to match that branch. If all possible alter-
- natives start with a circumflex, that is, if the pattern is
+ pattern is ever to match that branch. If all possible
+ alternatives start with a circumflex, that is, if the pattern is
  constrained to match only at the start of the subject, it is
- said to be an "anchored" pattern. (There are also other con-
- structs that can cause a pattern to be anchored.)
+ said to be an "anchored" pattern. (There are also other
+ constructs that can cause a pattern to be anchored.)
  A dollar character is an assertion which is &true; only if the current matching point is at the end of the subject string,
@@ -779,10 +779,10 @@
  before an internal "\n" character, respectively, in addition to matching at the start and end of the subject string. For example, the pattern /^abc$/ matches the subject string
- "def\nabc" in multiline mode, but not otherwise. Conse-
- quently, patterns that are anchored in single line mode
- because all branches start with "^" are not anchored in mul-
- tiline mode. The PCRE_DOLLAR_ENDONLY option is ignored if
+ "def\nabc" in multiline mode, but not otherwise.
+ Consequently, patterns that are anchored in single line mode
+ because all branches start with "^" are not anchored in
+ multiline mode. The PCRE_DOLLAR_ENDONLY option is ignored if
  PCRE_MULTILINE is set. Note that the sequences \A, \Z, and \z can be used to match
@@ -798,9 +798,9 @@
  Outside a character class, a dot in the pattern matches any one character in the subject, including a non-printing character, but not (by default) newline. If the PCRE_DOTALL
- option is set, then dots match newlines as well. The han-
- dling of dot is entirely independent of the handling of cir-
- cumflex and dollar, the only relationship being that they
+ option is set, then dots match newlines as well. The
+ handling of dot is entirely independent of the handling of
+ circumflex and dollar, the only relationship being that they
  both involve newline characters. Dot has no special meaning in a character class.
@@ -809,25 +809,25 @@
  Square brackets
- An opening square bracket introduces a character class, ter-
- minated by a closing square bracket. A closing square
+ An opening square bracket introduces a character class,
+ terminated by a closing square bracket. A closing square
  bracket on its own is not special.
  If a closing square bracket is required as a member of the class, it should be
- the first data character in the class (after an initial cir-
- cumflex, if present) or escaped with a backslash.
+ the first data character in the class (after an initial
+ circumflex, if present) or escaped with a backslash.
  A character class matches a single character in the subject; the character must be in the set of characters defined by
- the class, unless the first character in the class is a cir-
- cumflex, in which case the subject character must not be in
+ the class, unless the first character in the class is a
+ circumflex, in which case the subject character must not be in
  the set defined by the class. If a circumflex is actually required as a member of the class, ensure it is not the first character, or escape it with a backslash. For example, the character class [aeiou] matches any lower case vowel, while [^aeiou] matches any character that is not
- a lower case vowel. Note that a circumflex is just a con-
- venient notation for specifying the characters which are in
+ a lower case vowel. Note that a circumflex is just a
+ convenient notation for specifying the characters which are in
  the class by enumerating those that are not. It is not an assertion: it still consumes a character from the subject string, and fails if the current pointer is at the end of
@@ -836,8 +836,8 @@
  When caseless matching is set, any letters in a class represent both their upper case and lower case versions, so for example, a caseless [aeiou] matches "A" as well as "a",
- and a caseless [^aeiou] does not match "A", whereas a case-
- ful version would.
+ and a caseless [^aeiou] does not match "A", whereas a
+ caseful version would.
  The newline character is never treated in any special way in character classes, whatever the setting of the PCRE_DOTALL
@@ -848,17 +848,17 @@
  of characters in a character class. For example, [d-m] matches any letter between d and m, inclusive.
  If a minus character is required in a class, it must be escaped with a
- backslash or appear in a position where it cannot be inter-
- preted as indicating a range, typically as the first or last
+ backslash or appear in a position where it cannot be
+ interpreted as indicating a range, typically as the first or last
  character in the class. It is not possible to have the literal character "]" as the end character of a range. A pattern such as [W-]46] is
- interpreted as a class of two characters ("W" and "-") fol-
- lowed by a literal string "46]", so it would match "W46]" or
+ interpreted as a class of two characters ("W" and "-")
+ followed by a literal string "46]", so it would match "W46]" or
  "-46]". However, if the "]" is escaped with a backslash it
- is interpreted as the end of range, so [W-\]46] is inter-
- preted as a single class containing a range followed by two
+ is interpreted as the end of range, so [W-\]46] is
+ interpreted as a single class containing a range followed by two
  separate characters. The octal or hexadecimal representation of "]" can also be used to end a range.
@@ -875,8 +875,8 @@
  appear in a character class, and add the characters that they match to the class. For example, [\dABCDEF] matches any hexadecimal digit. A circumflex can conveniently be used
- with the upper case character types to specify a more res-
- tricted set of characters than the matching lower case type.
+ with the upper case character types to specify a more
+ restricted set of characters than the matching lower case type.
  For example, the class [^\W_] matches any letter or digit, but not underscore.
@@ -984,8 +984,8 @@
  which can be nested. Marking part of a pattern as a subpattern does two things:
- 1. It localizes a set of alternatives. For example, the pat-
- tern
+ 1. It localizes a set of alternatives. For example, the
+ pattern
  cat(aract|erpillar|)
@@ -1131,8 +1131,8 @@
  does the right thing with the C comments.
  The meaning of the various quantifiers is not otherwise changed, just the preferred
- number of matches. Do not confuse this use of ques-
- tion mark with its use as a quantifier in its own right.
+ number of matches. Do not confuse this use of
+ question mark with its use as a quantifier in its own right.
  Because it has two uses, it can sometimes appear doubled, as in
@@ -1374,8 +1374,8 @@
  Once-only subpatterns With both maximizing and minimizing repetition, failure of
- what follows normally causes the repeated item to be re-
- evaluated to see if a different number of repeats allows the
+ what follows normally causes the repeated item to be
+ re-evaluated to see if a different number of repeats allows the
  rest of the pattern to match. Sometimes it is useful to prevent this, either to change the nature of the match, or to cause it fail earlier than it otherwise might, when the
@@ -1401,8 +1401,8 @@
  This kind of parenthesis "locks up" the part of the pattern it contains once it has matched, and a failure further into
- the pattern is prevented from backtracking into it. Back-
- tracking past it to previous items, however, works as normal.
+ the pattern is prevented from backtracking into it.
+ Backtracking past it to previous items, however, works as normal.
  An alternative description is that a subpattern of this type matches the string of characters that an identical standalone
@@ -1419,8 +1419,8 @@
  This construction can of course contain arbitrarily complicated subpatterns, and it can be nested.
- Once-only subpatterns can be used in conjunction with look-
- behind assertions to specify efficient matching at the end
+ Once-only subpatterns can be used in conjunction with
+ look-behind assertions to specify efficient matching at the end
  of the subject string. Consider a simple pattern such as abcd$
@@ -1547,8 +1547,8 @@
  comment play no part in the pattern matching at all.
  If the PCRE_EXTENDED option is set, an unescaped # character
- outside a character class introduces a comment that contin-
- ues up to the next newline character in the pattern.
+ outside a character class introduces a comment that
+ continues up to the next newline character in the pattern.
@@ -1571,8 +1571,8 @@
  \( ( (?>[^()]+) | (?R) )* \) First it matches an opening parenthesis. Then it matches any
- number of substrings which can either be a sequence of non-
- parentheses, or a recursive match of the pattern itself
+ number of substrings which can either be a sequence of
+ non-parentheses, or a recursive match of the pattern itself
  (i.e. a correctly parenthesized substring). Finally there is a closing parenthesis.