php-doc-en/reference/pcre/functions/preg-replace.xml
2003-12-21 01:05:43 +00:00

286 lines
9.3 KiB
XML

<?xml version="1.0" encoding="iso-8859-1"?>
<!-- $Revision: 1.12 $ -->
<!-- splitted from ./en/functions/pcre.xml, last change in rev 1.2 -->
<refentry id="function.preg-replace">
<refnamediv>
<refname>preg_replace</refname>
<refpurpose>Perform a regular expression search and replace</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<methodsynopsis>
<type>mixed</type><methodname>preg_replace</methodname>
<methodparam><type>mixed</type><parameter>pattern</parameter></methodparam>
<methodparam><type>mixed</type><parameter>replacement</parameter></methodparam>
<methodparam><type>mixed</type><parameter>subject</parameter></methodparam>
<methodparam choice="opt"><type>int</type><parameter>limit</parameter></methodparam>
</methodsynopsis>
<para>
Searches <parameter>subject</parameter> for matches to
<parameter>pattern</parameter> and replaces them with
<parameter>replacement</parameter>. If
<parameter>limit</parameter> is specified, then only
<parameter>limit</parameter> matches will be replaced; if
<parameter>limit</parameter> is omitted or is -1, then all
matches are replaced.
</para>
<para>
<parameter>Replacement</parameter> may contain references of the form
<literal>\\<replaceable>n</replaceable></literal> or (since PHP 4.0.4)
<literal><replaceable>$n</replaceable></literal>, with the latter form
being the preferred one. Every such reference will be replaced by the text
captured by the <replaceable>n</replaceable>'th parenthesized pattern.
<replaceable>n </replaceable>can be from 0 to 99, and
<literal>\\0</literal> or <literal>$0</literal> refers to the text matched
by the whole pattern. Opening parentheses are counted from left to right
(starting from 1) to obtain the number of the capturing subpattern.
</para>
<para>
When working with a replacement pattern where a backreference is immediately
followed by another number (i.e.: placing a literal number immediately
after a matched pattern), you cannot use the familiar <literal>\\1</literal>
notation for your backreference. <literal>\\11</literal>, for example,
would confuse <function>preg_replace</function> since it does not know whether
you want the <literal>\\1</literal> backreference followed by a literal <literal>1</literal>,
or the <literal>\\11</literal> backreference followed by nothing. In this case
the solution is to use <literal>\${1}1</literal>. This creates an
isolated <literal>$1</literal> backreference, leaving the <literal>1</literal>
as a literal.
</para>
<para>
<example>
<title>Using backreferences followed by numeric literals</title>
<programlisting role="php">
<![CDATA[
<?php
$string = "April 15, 2003";
$pattern = "/(\w+) (\d+), (\d+)/i";
$replacement = "\${1}1,\$3";
echo preg_replace($pattern, $replacement, $string);
?>
]]>
</programlisting>
<para>
This example will output :
</para>
<screen>
<![CDATA[
April1,2003
]]>
</screen>
</example>
</para>
<para>
If matches are found, the new <parameter>subject</parameter> will
be returned, otherwise <parameter>subject</parameter> will be
returned unchanged.
</para>
<para>
Every parameter to <function>preg_replace</function> (except
<parameter>limit</parameter>) can be an unidimensional array.
When using arrays with <parameter>pattern</parameter> and
<parameter>replacement</parameter>, the keys are processed in the order
they appear in the array. This is <emphasis>not necessarily</emphasis>
the same as the numerical index order. If you use indexes to identify
which <parameter>pattern</parameter> should be replaced by which
<parameter>replacement</parameter>, you should perform a
<function>ksort</function> on each array prior to calling
<function>preg_replace</function>.
</para>
<para>
<example>
<title>Using indexed arrays with <function>preg_replace</function></title>
<programlisting role="php">
<![CDATA[
<?php
$string = "The quick brown fox jumped over the lazy dog.";
$patterns[0] = "/quick/";
$patterns[1] = "/brown/";
$patterns[2] = "/fox/";
$replacements[2] = "bear";
$replacements[1] = "black";
$replacements[0] = "slow";
echo preg_replace($patterns, $replacements, $string);
?>
]]>
</programlisting>
<para>
Output:
</para>
<screen>
<![CDATA[
The bear black slow jumped over the lazy dog.
]]>
</screen>
<para>
By ksorting patterns and replacements, we should get what we wanted.
</para>
<programlisting role="php">
<![CDATA[
<?php
ksort($patterns);
ksort($replacements);
echo preg_replace($patterns, $replacements, $string);
?>
]]>
</programlisting>
<para>
Output :
</para>
<screen>
<![CDATA[
The slow black bear jumped over the lazy dog.
]]>
</screen>
</example>
</para>
<para>
If <parameter>subject</parameter> is an array, then the search
and replace is performed on every entry of
<parameter>subject</parameter>, and the return value is an array
as well.
</para>
<para>
If <parameter>pattern</parameter> and <parameter>replacement</parameter>
are arrays, then <function>preg_replace</function> takes a value from
each array and uses them to do search and replace on
<parameter>subject</parameter>. If <parameter>replacement</parameter>
has fewer values than <parameter>pattern</parameter>, then empty string
is used for the rest of replacement values. If <parameter>pattern
</parameter> is an array and <parameter>replacement</parameter> is a
string, then this replacement string is used for every value of
<parameter>pattern</parameter>. The converse would not make sense,
though.
</para>
<para>
<literal>/e</literal> modifier makes <function>preg_replace</function>
treat the <parameter>replacement</parameter> parameter as PHP code after
the appropriate references substitution is done. Tip: make sure that
<parameter>replacement</parameter> constitutes a valid PHP code string,
otherwise PHP will complain about a parse error at the line containing
<function>preg_replace</function>.
</para>
<para>
<example>
<title>Replacing several values</title>
<programlisting role="php">
<![CDATA[
<?php
$patterns = array ("/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/",
"/^\s*{(\w+)}\s*=/");
$replace = array ("\\3/\\4/\\1\\2", "$\\1 =");
echo preg_replace($patterns, $replace, "{startDate} = 1999-5-27");
?>
]]>
</programlisting>
<para>
This example will produce:
</para>
<screen>
<![CDATA[
$startDate = 5/27/1999
]]>
</screen>
</example>
</para>
<para>
<example>
<title>Using /e modifier</title>
<programlisting role="php">
<![CDATA[
<?php
preg_replace("/(<\/?)(\w+)([^>]*>)/e",
"'\\1'.strtoupper('\\2').'\\3'",
$html_body);
?>
]]>
</programlisting>
<para>
This would capitalize all HTML tags in the input text.
</para>
</example>
</para>
<para>
<example>
<title>Convert HTML to text</title>
<programlisting role="php">
<![CDATA[
<?php
// $document should contain an HTML document.
// This will remove HTML tags, javascript sections
// and white space. It will also convert some
// common HTML entities to their text equivalent.
$search = array ("'<script[^>]*?>.*?</script>'si", // Strip out javascript
"'<[\/\!]*?[^<>]*?>'si", // Strip out HTML tags
"'([\r\n])[\s]+'", // Strip out white space
"'&(quot|#34);'i", // Replace HTML entities
"'&(amp|#38);'i",
"'&(lt|#60);'i",
"'&(gt|#62);'i",
"'&(nbsp|#160);'i",
"'&(iexcl|#161);'i",
"'&(cent|#162);'i",
"'&(pound|#163);'i",
"'&(copy|#169);'i",
"'&#(\d+);'e"); // evaluate as php
$replace = array ("",
"",
"\\1",
"\"",
"&",
"<",
">",
" ",
chr(161),
chr(162),
chr(163),
chr(169),
"chr(\\1)");
$text = preg_replace($search, $replace, $document);
?>
]]>
</programlisting>
</example>
</para>
<note>
<para>
Parameter <parameter>limit</parameter> was added after PHP 4.0.1pl2.
</para>
</note>
<para>
See also <function>preg_match</function>,
<function>preg_match_all</function>, and
<function>preg_split</function>.
</para>
</refsect1>
</refentry>
<!-- Keep this comment at the end of the file
Local variables:
mode: sgml
sgml-omittag:t
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
indent-tabs-mode:nil
sgml-parent-document:nil
sgml-default-dtd-file:"../../../../manual.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:nil
sgml-local-ecat-files:nil
End:
vim600: syn=xml fen fdm=syntax fdl=2 si
vim: et tw=78 syn=sgml
vi: ts=1 sw=1
-->