php-doc-en/functions/regex.sgml
1999-06-21 22:12:43 +00:00


<reference id="ref.regex">
<title>Regular expression functions</title>
<titleabbrev>Regexps</titleabbrev>
<partintro>
<para>
Regular expressions are used for complex string manipulation in
PHP. The functions that support regular expressions are:
<itemizedlist>
<listitem>
<simpara><function>ereg</function>
</listitem>
<listitem>
<simpara><function>ereg_replace</function>
</listitem>
<listitem>
<simpara><function>eregi</function>
</listitem>
<listitem>
<simpara><function>eregi_replace</function>
</listitem>
<listitem>
<simpara><function>split</function>
</listitem>
</itemizedlist>
These functions all take a regular expression string as their
first argument. PHP uses the POSIX extended regular expressions
as defined by POSIX 1003.2. For a full description of POSIX
regular expressions see the regex man pages included in the regex
directory in the PHP distribution.
<!-- Should add discussion of PCRE functions here. -->
<para>
<example>
<title>Regular expression examples</title>
<programlisting>
ereg(&quot;abc&quot;,$string);
/* Returns true if &quot;abc&quot;
is found anywhere in $string. */
ereg(&quot;^abc&quot;,$string);
/* Returns true if &quot;abc&quot;
is found at the beginning of $string. */
ereg("abc$",$string);
/* Returns true if &quot;abc&quot;
is found at the end of $string. */
eregi("(ozilla.[23]|MSIE.3)",$HTTP_USER_AGENT);
/* Returns true if client browser
is Netscape 2, 3 or MSIE 3. */
ereg("([[:alnum:]]+) ([[:alnum:]]+) ([[:alnum:]]+)",
$string,$regs);
/* Places three space separated words
into $regs[1], $regs[2] and $regs[3]. */
$string = ereg_replace("^","&lt;BR&gt;",$string);
/* Put a &lt;BR&gt; tag at the beginning of $string. */
$string = ereg_replace("$","&lt;BR&gt;",$string);
/* Put a &lt;BR&gt; tag at the end of $string. */
$string = ereg_replace("\n","",$string);
/* Get rid of any newline
characters in $string. */
</programlisting>
</example>
</partintro>
<refentry id="function.ereg">
<refnamediv>
<refname>ereg</refname>
<refpurpose>regular expression match</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<funcsynopsis>
<funcdef>int <function>ereg</function></funcdef>
<paramdef>string <parameter>pattern</parameter></paramdef>
<paramdef>string <parameter>string</parameter></paramdef>
<paramdef>array <parameter><optional>regs</optional></parameter></paramdef>
</funcsynopsis>
<simpara>
Searches <parameter>string</parameter> for matches to the regular expression
given in <parameter>pattern</parameter>.
<simpara>
If matches are found for parenthesized substrings of
<parameter>pattern</parameter> and the function is called with
the third argument <parameter>regs</parameter>, the matches will
be stored in the elements of
<parameter>regs</parameter>. $regs[1] will contain the substring
which starts at the first left parenthesis; $regs[2] will contain the
substring starting at the second, and so on. $regs[0] will
contain a copy of the complete string matched.
<para>
Searching is case sensitive.
<para>
Returns true if a match for <parameter>pattern</parameter> was found
in <parameter>string</parameter>, or false if no matches were found
or an error occurred.
<para>
The following code snippet takes a date in ISO format (YYYY-MM-DD)
and prints it in DD.MM.YYYY format:
<example>
<title>ereg() example</title>
<programlisting>
if ( ereg( "([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})", $date, $regs ) ) {
echo "$regs[3].$regs[2].$regs[1]";
} else {
echo "Invalid date format: $date";
}
</programlisting></example>
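<para>
The pattern in the example above is not anchored, so it also
accepts a date embedded in a longer string. As a minimal sketch
(the variable names and sample values are illustrative assumptions,
not part of the original example), anchoring the pattern with
<literal>^</literal> and <literal>$</literal> requires the whole
string to be a date:
<example>
<title>Anchored ereg() pattern (illustrative sketch)</title>
<programlisting>
/* Hypothetical sample inputs, chosen only for illustration. */
$pattern = "^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})$";
$good = "1999-06-21";
$bad = "released on 1999-06-21, maybe";

if ( ereg( $pattern, $good, $regs ) ) {
    /* $regs[1], $regs[2] and $regs[3] hold year, month and day. */
    echo "$regs[3].$regs[2].$regs[1]\n";
}
if ( !ereg( $pattern, $bad ) ) {
    /* The text around the date prevents an anchored match. */
    echo "Invalid date format: $bad\n";
}
</programlisting></example>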
<para>
See also <function>eregi</function>, <function>ereg_replace</function>, and <function>eregi_replace</function>.
</refsect1>
</refentry>
<refentry id="function.ereg-replace">
<refnamediv>
<refname>ereg_replace</refname>
<refpurpose>replace regular expression</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<funcsynopsis>
<funcdef>string <function>ereg_replace</function></funcdef>
<paramdef>string <parameter>pattern</parameter></paramdef>
<paramdef>string <parameter>replacement</parameter></paramdef>
<paramdef>string <parameter>string</parameter></paramdef>
</funcsynopsis>
<para>
This function scans <parameter>string</parameter> for matches to
<parameter>pattern</parameter>, then replaces the matched text
with <parameter>replacement</parameter>.
<para>
The modified string is returned.
<para>
If <parameter>pattern</parameter> contains parenthesized
substrings, <parameter>replacement</parameter> may contain
substrings of the form
<literal>\\<replaceable>digit</replaceable></literal>, which will
be replaced by the text matching the digit'th parenthesized
substring; <literal>\\0</literal> will produce the entire
contents of the matched text. Up to nine substrings may be used.
Parentheses may be nested, in which case they are counted by the
opening parenthesis.
<para>
If no matches are found in <parameter>string</parameter>, then
<parameter>string</parameter> will be returned unchanged.
<para>
For example, the following code snippet
prints "This was a test" three times:
<example>
<title>ereg_replace() example</title>
<programlisting>
$string = "This is a test";
echo ereg_replace( " is", " was", $string );
echo ereg_replace( "( )is", "\\1was", $string );
echo ereg_replace( "(( )is)", "\\2was", $string );
</programlisting>
</example>
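<para>
Substring references can also reorder the matched pieces. The
following sketch assumes a hypothetical <literal>$date</literal>
value in ISO format and rewrites it as DD.MM.YYYY in a single call:
<example>
<title>Reordering substrings with ereg_replace() (illustrative sketch)</title>
<programlisting>
/* $date is an assumed sample value. */
$date = "1999-06-21";

/* \\1 is the year, \\2 the month and \\3 the day. */
$result = ereg_replace( "([0-9]{4})-([0-9]{2})-([0-9]{2})",
                        "\\3.\\2.\\1", $date );
echo $result;
/* Prints 21.06.1999 */
</programlisting>
</example>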
<para>
See also <function>ereg</function>, <function>eregi</function>,
and <function>eregi_replace</function>.
</refsect1>
</refentry>
<refentry id="function.eregi">
<refnamediv>
<refname>eregi</refname>
<refpurpose>case insensitive regular expression match</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<funcsynopsis>
<funcdef>int <function>eregi</function></funcdef>
<paramdef>string <parameter>pattern</parameter></paramdef>
<paramdef>string <parameter>string</parameter></paramdef>
<paramdef>array <parameter><optional>regs</optional></parameter></paramdef>
</funcsynopsis>
<para>
This function is identical to <function>ereg</function>, except that it
ignores case distinctions when matching alphabetic characters.
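<para>
As a small sketch of the difference, with a made-up user agent
string as the assumed input, the same pattern fails with
<function>ereg</function> but succeeds with
<function>eregi</function>:
<example>
<title>eregi() example (illustrative sketch)</title>
<programlisting>
/* $agent is a hypothetical, lowercased user agent string. */
$agent = "mozilla/4.0 (compatible; msie 5.0)";

$sensitive = ereg( "MSIE", $agent );    /* No match: case differs. */
$insensitive = eregi( "MSIE", $agent ); /* Match: case is ignored. */

if ( !$sensitive ) {
    echo "ereg did not match\n";
}
if ( $insensitive ) {
    echo "eregi matched\n";
}
</programlisting>
</example>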
<para>
See also <function>ereg</function>, <function>ereg_replace</function>, and <function>eregi_replace</function>.
</refsect1>
</refentry>
<refentry id="function.eregi-replace">
<refnamediv>
<refname>eregi_replace</refname>
<refpurpose>case insensitive regular expression replace</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<funcsynopsis>
<funcdef>string <function>eregi_replace</function></funcdef>
<paramdef>string <parameter>pattern</parameter></paramdef>
<paramdef>string <parameter>replacement</parameter></paramdef>
<paramdef>string <parameter>string</parameter></paramdef>
</funcsynopsis>
<para>
This function is identical to <function>ereg_replace</function>, except that
it ignores case distinctions when matching alphabetic characters.
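<para>
For instance, the following sketch (the sample string is an
assumption made for illustration) normalizes every spelling of a
word to one form:
<example>
<title>eregi_replace() example (illustrative sketch)</title>
<programlisting>
/* Hypothetical input with three different capitalizations. */
$string = "php, Php and PHP";

/* Every case variant of "php" is replaced by "PHP". */
$result = eregi_replace( "php", "PHP", $string );
echo $result;
/* Prints PHP, PHP and PHP */
</programlisting>
</example>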
<para>
See also <function>ereg</function>, <function>eregi</function>, and <function>ereg_replace</function>.
</refsect1>
</refentry>
<refentry id="function.split">
<refnamediv>
<refname>split</refname>
<refpurpose>split string into array by regular expression</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<funcsynopsis>
<funcdef>array <function>split</function></funcdef>
<paramdef>string <parameter>pattern</parameter></paramdef>
<paramdef>string <parameter>string</parameter></paramdef>
<paramdef>int <parameter><optional>limit</optional></parameter></paramdef>
</funcsynopsis>
<para>
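Returns an array of strings, each of which is a substring of
<parameter>string</parameter> formed by splitting it on boundaries formed
by <parameter>pattern</parameter>. If <parameter>limit</parameter> is
given, the array contains at most <parameter>limit</parameter> elements,
and the last element holds the remainder of
<parameter>string</parameter>. If an error occurs, returns false.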
<para>
To split off the first four fields from a line from
<filename>/etc/passwd</filename> (the fifth element receives the
rest of the line):
<example>
<title>split() example</title>
<programlisting>
$passwd_list = split( ":", $passwd_line, 5 );
</programlisting></example>
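<para>
Because <parameter>pattern</parameter> is a regular expression, a
bracket expression can accept several delimiters at once. The
sketch below assumes a hypothetical <literal>$date</literal> that
may be separated by slashes, dots, or hyphens:
<example>
<title>split() with a bracket expression (illustrative sketch)</title>
<programlisting>
/* $date is an assumed sample value. */
$date = "04/30/1973";

/* The hyphen is placed last so that it is taken literally. */
$parts = split( "[/.-]", $date );
echo "Month: $parts[0]; Day: $parts[1]; Year: $parts[2]";
/* Prints Month: 04; Day: 30; Year: 1973 */
</programlisting></example>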
<para>
Note that <parameter>pattern</parameter> is case-sensitive.
<para>
See also <function>explode</function> and <function>implode</function>.
</refsect1>
</refentry>
<refentry id="function.sql-regcase">
<refnamediv>
<refname>sql_regcase</refname>
<refpurpose>make regular expression for case insensitive match</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<funcsynopsis>
<funcdef>string <function>sql_regcase</function></funcdef>
<paramdef>string <parameter>string</parameter></paramdef>
</funcsynopsis>
<para>
Returns a valid regular expression which will match <parameter>string</parameter>,
ignoring case. This expression is <parameter>string</parameter> with each character
converted to a bracket expression; this bracket expression
contains that character's uppercase and lowercase form if
applicable, otherwise it contains the original character
twice.
<example>
<title>sql_regcase() example</title>
<programlisting>
echo sql_regcase( "Foo bar" );
</programlisting></example>
prints <screen>[Ff][Oo][Oo][ ][Bb][Aa][Rr]</screen>.
<para>
This can be used to achieve case insensitive pattern matching in
products which support only case sensitive regular expressions.
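<para>
As a rough sketch of that use, <function>ereg</function> stands in
below for a matcher that is case sensitive only; the sample strings
are assumptions chosen for illustration:
<example>
<title>Using the result of sql_regcase() (illustrative sketch)</title>
<programlisting>
/* Build a pattern that matches "php" in any capitalization. */
$pattern = sql_regcase( "php" );
echo $pattern;
/* Prints [Pp][Hh][Pp] */

/* ereg is case sensitive, but the generated pattern is not. */
$found = ereg( $pattern, "I like PHP" );
if ( $found ) {
    echo "\nmatched";
}
</programlisting></example>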
</refsect1>
</refentry>
</reference>
<!-- Keep this comment at the end of the file
Local variables:
mode: sgml
sgml-omittag:t
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"../manual.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:nil
sgml-local-ecat-files:nil
End:
-->