Mirror of https://github.com/sigmasternchen/php-doc-en (synced 2025-03-15 08:28:54 +00:00)
Improve documentation of string encoding conversion functions
- Move utf8_encode and utf8_decode into the strings chapter, since they were moved out of the XML extension in 7.2
- Recommend mb_convert_encoding, iconv, and UConverter::transcode when mentioning encoding in passing
- Document UConverter::transcode, based on examination of source and upstream ICU docs
- Make the language used more consistent, e.g. "convert" rather than "encode"/"decode", "encoding" rather than "charset"

Closes GH-1418.
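The commit message recommends mb_convert_encoding, iconv, and UConverter::transcode as the general-purpose conversion functions. As a rough illustration (not part of the commit itself), the sketch below runs the same ISO-8859-1 to UTF-8 conversion through each of them. It assumes the mbstring, intl, and iconv extensions may or may not be installed, so each call is guarded; the variable names are made up for this example.

```php
<?php
// 'Zoë' as ISO-8859-1 bytes (the same sample bytes as the utf8_encode example in this commit).
$latin1 = "\x5A\x6F\xEB";

$results = [];

// mbstring: argument order is (string, to_encoding, from_encoding)
if (function_exists('mb_convert_encoding')) {
    $results['mbstring'] = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
}

// intl: argument order is (str, toEncoding, fromEncoding)
if (class_exists('UConverter')) {
    $results['intl'] = UConverter::transcode($latin1, 'UTF-8', 'ISO-8859-1');
}

// iconv: argument order is (from_encoding, to_encoding, string)
if (function_exists('iconv')) {
    $results['iconv'] = iconv('ISO-8859-1', 'UTF-8', $latin1);
}

foreach ($results as $extension => $utf8) {
    // Each available extension should print 5a6fc3ab, the UTF-8 bytes of 'Zoë'.
    echo $extension, ': ', bin2hex($utf8), "\n";
}
```

Note that iconv takes the source encoding first and the string last, while the other two take the string first and the target encoding second; this is an easy place to swap arguments when moving code from one function to another.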
parent 8b0e03372d
commit 99d758bd25
11 changed files with 259 additions and 60 deletions
|
@ -1401,7 +1401,7 @@ it is inserted with (e.g.) <function xmlns="http://docbook.org/ns/docbook">DOMNo
|
|||
<emphasis>could</emphasis> be called statically, but would issue an <constant>E_DEPRECATED</constant> error.
|
||||
As of PHP 8.0.0 calling this method statically throws an <classname>Error</classname> exception</para>'>
|
||||
<!ENTITY dom.malformederror '<para xmlns="http://docbook.org/ns/docbook">While malformed HTML should load successfully, this function may generate <constant>E_WARNING</constant> errors when it encounters bad markup. <link linkend="function.libxml-use-internal-errors">libxml's error handling functions</link> may be used to handle these errors.</para>'>
|
||||
<!ENTITY dom.note.utf8 '<note xmlns="http://docbook.org/ns/docbook"><para>The DOM extension uses UTF-8 encoding. Use <function>utf8_encode</function> and <function>utf8_decode</function> to work with texts in ISO-8859-1 encoding or <link linkend="ref.iconv">iconv</link> for other encodings.</para></note>'>
|
||||
<!ENTITY dom.note.utf8 '<note xmlns="http://docbook.org/ns/docbook"><para>The DOM extension uses UTF-8 encoding. Use <function>mb_convert_encoding</function>, <methodname>UConverter::transcode</methodname>, or <function>iconv</function> to handle other encodings.</para></note>'>
|
||||
<!ENTITY dom.note.json '<note xmlns="http://docbook.org/ns/docbook"><para>When using <function>json_encode</function> on a <classname>DOMDocument</classname> object the result will be that of encoding an empty object.</para></note>'>
|
||||
|
||||
|
||||
|
|
|
@ -3,7 +3,7 @@
|
|||
<refentry xml:id="function.iconv" xmlns="http://docbook.org/ns/docbook">
|
||||
<refnamediv>
|
||||
<refname>iconv</refname>
|
||||
<refpurpose>Convert string to requested character encoding</refpurpose>
|
||||
<refpurpose>Convert a string from one character encoding to another</refpurpose>
|
||||
</refnamediv>
|
||||
|
||||
<refsect1 role="description">
|
||||
|
@ -15,8 +15,7 @@
|
|||
<methodparam><type>string</type><parameter>string</parameter></methodparam>
|
||||
</methodsynopsis>
|
||||
<para>
|
||||
Performs a character set conversion on the string
|
||||
<parameter>string</parameter> from <parameter>from_encoding</parameter>
|
||||
Converts <parameter>string</parameter> from <parameter>from_encoding</parameter>
|
||||
to <parameter>to_encoding</parameter>.
|
||||
</para>
|
||||
</refsect1>
|
||||
|
@ -29,7 +28,7 @@
|
|||
<term><parameter>from_encoding</parameter></term>
|
||||
<listitem>
|
||||
<para>
|
||||
The input charset.
|
||||
The current encoding used to interpret <parameter>string</parameter>.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
@ -37,14 +36,14 @@
|
|||
<term><parameter>to_encoding</parameter></term>
|
||||
<listitem>
|
||||
<para>
|
||||
The output charset.
|
||||
The desired encoding of the result.
|
||||
</para>
|
||||
<para>
|
||||
If you append the string <literal>//TRANSLIT</literal> to
|
||||
<parameter>to_encoding</parameter> transliteration is activated. This
|
||||
If the string <literal>//TRANSLIT</literal> is appended to
|
||||
<parameter>to_encoding</parameter>, then transliteration is activated. This
|
||||
means that when a character can't be represented in the target charset,
|
||||
it can be approximated through one or several similarly looking
|
||||
characters. If you append the string <literal>//IGNORE</literal>,
|
||||
it may be approximated through one or several similarly looking
|
||||
characters. If the string <literal>//IGNORE</literal> is appended,
|
||||
characters that cannot be represented in the target charset are silently
|
||||
discarded. Otherwise, <constant>E_NOTICE</constant> is generated and the function
|
||||
will return &false;.
|
||||
|
@ -64,7 +63,7 @@
|
|||
<term><parameter>string</parameter></term>
|
||||
<listitem>
|
||||
<para>
|
||||
The string to be converted.
|
||||
The &string; to be converted.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
@ -75,10 +74,22 @@
|
|||
<refsect1 role="returnvalues">
|
||||
&reftitle.returnvalues;
|
||||
<para>
|
||||
Returns the converted string&return.falseforfailure;.
|
||||
Returns the converted string,&return.falseforfailure;.
|
||||
</para>
|
||||
</refsect1>
|
||||
|
||||
<refsect1 role="notes">
|
||||
&reftitle.notes;
|
||||
<note>
|
||||
<para>
|
||||
The character encodings and options available depend on the installed implementation
|
||||
of iconv. If the argument to <parameter>from_encoding</parameter>
|
||||
or <parameter>to_encoding</parameter> is not supported on the current system, &false;
|
||||
will be returned.
|
||||
</para>
|
||||
</note>
|
||||
</refsect1>
|
||||
|
||||
<refsect1 role="examples">
|
||||
&reftitle.examples;
|
||||
<para>
|
||||
|
@ -111,7 +122,15 @@ Notice: iconv(): Detected an illegal character in input string in .\iconv-exampl
|
|||
</para>
|
||||
</refsect1>
|
||||
|
||||
|
||||
<refsect1 role="seealso">
|
||||
&reftitle.seealso;
|
||||
<para>
|
||||
<simplelist>
|
||||
<member><function>mb_convert_encoding</function></member>
|
||||
<member><methodname>UConverter::transcode</methodname></member>
|
||||
</simplelist>
|
||||
</para>
|
||||
</refsect1>
|
||||
|
||||
</refentry>
|
||||
<!-- Keep this comment at the end of the file
|
||||
|
|
|
@ -86,8 +86,6 @@ Array
|
|||
[19] => xml_parser_free
|
||||
[20] => xml_parser_set_option
|
||||
[21] => xml_parser_get_option
|
||||
[22] => utf8_encode
|
||||
[23] => utf8_decode
|
||||
)
|
||||
]]>
|
||||
</screen>
|
||||
|
|
|
@ -3,7 +3,7 @@
|
|||
<refentry xml:id="uconverter.transcode" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">
|
||||
<refnamediv>
|
||||
<refname>UConverter::transcode</refname>
|
||||
<refpurpose>Convert string from one charset to another</refpurpose>
|
||||
<refpurpose>Convert a string from one character encoding to another</refpurpose>
|
||||
</refnamediv>
|
||||
|
||||
<refsect1 role="description">
|
||||
|
@ -16,11 +16,8 @@
|
|||
<methodparam choice="opt"><type class="union"><type>array</type><type>null</type></type><parameter>options</parameter><initializer>&null;</initializer></methodparam>
|
||||
</methodsynopsis>
|
||||
<para>
|
||||
|
||||
Converts <parameter>str</parameter> from <parameter>fromEncoding</parameter> to <parameter>toEncoding</parameter>.
|
||||
</para>
|
||||
|
||||
&warn.undocumented.func;
|
||||
|
||||
</refsect1>
|
||||
|
||||
<refsect1 role="parameters">
|
||||
|
@ -30,7 +27,7 @@
|
|||
<term><parameter>str</parameter></term>
|
||||
<listitem>
|
||||
<para>
|
||||
|
||||
The &string; to be converted.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
@ -38,7 +35,7 @@
|
|||
<term><parameter>toEncoding</parameter></term>
|
||||
<listitem>
|
||||
<para>
|
||||
|
||||
The desired encoding of the result.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
@ -46,7 +43,7 @@
|
|||
<term><parameter>fromEncoding</parameter></term>
|
||||
<listitem>
|
||||
<para>
|
||||
|
||||
The current encoding used to interpret <parameter>str</parameter>.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
@ -54,7 +51,15 @@
|
|||
<term><parameter>options</parameter></term>
|
||||
<listitem>
|
||||
<para>
|
||||
|
||||
An optional &array;, which may contain the following keys:
|
||||
<simplelist>
|
||||
<member>
|
||||
<literal>'to_subst'</literal> - the substitution character to use
|
||||
in place of any character of <parameter>str</parameter> which cannot
|
||||
be encoded in <parameter>toEncoding</parameter>. If specified, it must
|
||||
represent a single character in the target encoding.
|
||||
</member>
|
||||
</simplelist>
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
@ -64,10 +69,110 @@
|
|||
<refsect1 role="returnvalues">
|
||||
&reftitle.returnvalues;
|
||||
<para>
|
||||
|
||||
Returns the converted string&return.falseforfailure;.
|
||||
</para>
|
||||
</refsect1>
|
||||
|
||||
<refsect1 role="examples">
|
||||
&reftitle.examples;
|
||||
<example>
|
||||
<title>Converting from UTF-8 to UTF-16 and back</title>
|
||||
<programlisting role="php">
|
||||
<![CDATA[
|
||||
<?php
|
||||
$utf8_string = "\x5A\x6F\xC3\xAB"; // 'Zoë' in UTF-8
|
||||
$utf16_string = UConverter::transcode($utf8_string, 'UTF-16BE', 'UTF-8');
|
||||
echo bin2hex($utf16_string), "\n";
|
||||
|
||||
$new_utf8_string = UConverter::transcode($utf16_string, 'UTF-8', 'UTF-16BE');
|
||||
echo bin2hex($new_utf8_string), "\n";
|
||||
?>
|
||||
]]>
|
||||
</programlisting>
|
||||
&example.outputs;
|
||||
<screen>
|
||||
<![CDATA[
|
||||
005a006f00eb
|
||||
5a6fc3ab
|
||||
]]>
|
||||
</screen>
|
||||
</example>
|
||||
<example>
|
||||
<title>Invalid characters in input</title>
|
||||
<para>
|
||||
If the input string contains a sequence of bytes which is not valid in
|
||||
the encoding specified by <parameter>fromEncoding</parameter>, they are
|
||||
replaced by Unicode code point U+FFFD (Replacement Character) before
|
||||
converting to <parameter>toEncoding</parameter>.
|
||||
</para>
|
||||
<programlisting role="php">
|
||||
<![CDATA[
|
||||
<?php
|
||||
$invalid_utf8_string = "\xC3"; // incomplete multi-byte UTF-8 sequence
|
||||
$utf16_string = UConverter::transcode($invalid_utf8_string, 'UTF-16BE', 'UTF-8');
|
||||
echo bin2hex($utf16_string), "\n";
|
||||
?>
|
||||
]]>
|
||||
</programlisting>
|
||||
&example.outputs;
|
||||
<screen>
|
||||
<![CDATA[
|
||||
fffd
|
||||
]]>
|
||||
</screen>
|
||||
</example>
|
||||
<example>
|
||||
<title>Characters which cannot be encoded</title>
|
||||
<para>
|
||||
If the input string contains characters which cannot be represented
|
||||
in <parameter>toEncoding</parameter>, they are replaced with a single
|
||||
character. The default character to use depends on the encoding, and
|
||||
can be controlled using the <literal>'to_subst'</literal> option.
|
||||
</para>
|
||||
<programlisting role="php">
|
||||
<![CDATA[
|
||||
<?php
|
||||
$utf8_string = "\xE2\x82\xAC"; // € (Euro Sign) does not exist in ISO 8859-1
|
||||
|
||||
// Default replacement in ISO 8859-1 is "\x1A" (Substitute)
|
||||
$iso8859_1_string = UConverter::transcode($utf8_string, 'ISO-8859-1', 'UTF-8');
|
||||
echo bin2hex($iso8859_1_string), "\n";
|
||||
|
||||
// Specify a replacement of '?' ("\x3F") instead
|
||||
$iso8859_1_string = UConverter::transcode(
|
||||
$utf8_string, 'ISO-8859-1', 'UTF-8', ['to_subst' => '?']
|
||||
);
|
||||
echo bin2hex($iso8859_1_string), "\n";
|
||||
|
||||
// Since ISO 8859-1 cannot map U+FFFD, invalid input is also replaced by to_subst
|
||||
$invalid_utf8_string = "\xC3"; // incomplete multi-byte UTF-8 sequence
|
||||
$iso8859_1_string = UConverter::transcode(
|
||||
$invalid_utf8_string, 'ISO-8859-1', 'UTF-8', ['to_subst' => '?']
|
||||
);
|
||||
echo bin2hex($iso8859_1_string), "\n";
|
||||
?>
|
||||
]]>
|
||||
</programlisting>
|
||||
&example.outputs;
|
||||
<screen>
|
||||
<![CDATA[
|
||||
1a
|
||||
3f
|
||||
3f
|
||||
]]>
|
||||
</screen>
|
||||
</example>
|
||||
</refsect1>
|
||||
|
||||
<refsect1 role="seealso">
|
||||
&reftitle.seealso;
|
||||
<para>
|
||||
<simplelist>
|
||||
<member><function>mb_convert_encoding</function></member>
|
||||
<member><function>iconv</function></member>
|
||||
</simplelist>
|
||||
</para>
|
||||
</refsect1>
|
||||
|
||||
</refentry>
|
||||
<!-- Keep this comment at the end of the file
|
||||
|
|
|
@ -3,7 +3,7 @@
|
|||
<refentry xml:id="function.mb-convert-encoding" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">
|
||||
<refnamediv>
|
||||
<refname>mb_convert_encoding</refname>
|
||||
<refpurpose>Convert character encoding</refpurpose>
|
||||
<refpurpose>Convert a string from one character encoding to another</refpurpose>
|
||||
</refnamediv>
|
||||
|
||||
<refsect1 role="description">
|
||||
|
@ -15,9 +15,8 @@
|
|||
<methodparam choice="opt"><type class="union"><type>array</type><type>string</type><type>null</type></type><parameter>from_encoding</parameter><initializer>&null;</initializer></methodparam>
|
||||
</methodsynopsis>
|
||||
<para>
|
||||
Converts the character encoding of <parameter>string</parameter>
|
||||
to <parameter>to_encoding</parameter>
|
||||
from optionally <parameter>from_encoding</parameter>.
|
||||
Converts <parameter>string</parameter> from <parameter>from_encoding</parameter>,
|
||||
or the current internal encoding, to <parameter>to_encoding</parameter>.
|
||||
If <parameter>string</parameter> is an &array;, all its &string; values will be
|
||||
converted recursively.
|
||||
</para>
|
||||
|
@ -31,7 +30,7 @@
|
|||
<term><parameter>string</parameter></term>
|
||||
<listitem>
|
||||
<para>
|
||||
The &string; or &array; being encoded.
|
||||
The &string; or &array; to be converted.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
@ -39,7 +38,7 @@
|
|||
<term><parameter>to_encoding</parameter></term>
|
||||
<listitem>
|
||||
<para>
|
||||
The type of encoding that <parameter>string</parameter> is being converted to.
|
||||
The desired encoding of the result.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
@ -47,15 +46,20 @@
|
|||
<term><parameter>from_encoding</parameter></term>
|
||||
<listitem>
|
||||
<para>
|
||||
Is specified by character code names before conversion. It is either
|
||||
an <type>array</type>, or a comma separated enumerated list.
|
||||
If <parameter>from_encoding</parameter> is not specified, the internal
|
||||
encoding will be used.
|
||||
<!-- link to internal encoding info -->
|
||||
The current encoding used to interpret <parameter>string</parameter>.
|
||||
Multiple encodings may be specified as an &array; or comma separated
|
||||
list, in which case the correct encoding will be guessed using the
|
||||
same algorithm as <function>mb_detect_encoding</function>.
|
||||
</para>
|
||||
<para>
|
||||
See <link linkend="mbstring.supported-encodings">supported
|
||||
encodings</link>.
|
||||
If <parameter>from_encoding</parameter> is &null; or not specified, the
|
||||
<link linkend="ini.mbstring.internal-encoding">mbstring.internal_encoding setting</link>
|
||||
will be used if set, otherwise the <link linkend="ini.default-charset">default_charset setting</link>.
|
||||
</para>
|
||||
<para>
|
||||
See <link linkend="mbstring.supported-encodings">supported encodings</link>
|
||||
for valid values of <parameter>to_encoding</parameter>
|
||||
and <parameter>from_encoding</parameter>.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
@ -142,7 +146,7 @@ $str = mb_convert_encoding($str, "UTF-7", "EUC-JP");
|
|||
/* Auto detect encoding from JIS, eucjp-win, sjis-win, then convert str to UCS-2LE */
|
||||
$str = mb_convert_encoding($str, "UCS-2LE", "JIS, eucjp-win, sjis-win");
|
||||
|
||||
/* "auto" is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS" */
|
||||
/* If mbstring.language is "Japanese", "auto" is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS" */
|
||||
$str = mb_convert_encoding($str, "EUC-JP", "auto");
|
||||
?>
|
||||
]]>
|
||||
|
@ -156,6 +160,8 @@ $str = mb_convert_encoding($str, "EUC-JP", "auto");
|
|||
<para>
|
||||
<simplelist>
|
||||
<member><function>mb_detect_order</function></member>
|
||||
<member><methodname>UConverter::transcode</methodname></member>
|
||||
<member><function>iconv</function></member>
|
||||
</simplelist>
|
||||
</para>
|
||||
</refsect1>
|
||||
|
|
|
@ -83,6 +83,9 @@ echo recode_string("us..flat", "The following character has a diacritical mark:
|
|||
The GNU Recode documentation of your installation for detailed
|
||||
instructions about recode requests.
|
||||
</member>
|
||||
<member><function>mb_convert_encoding</function></member>
|
||||
<member><methodname>UConverter::transcode</methodname></member>
|
||||
<member><function>iconv</function></member>
|
||||
</simplelist>
|
||||
</para>
|
||||
</refsect1>
|
||||
|
|
|
@ -4,8 +4,8 @@
|
|||
<refnamediv>
|
||||
<refname>utf8_decode</refname>
|
||||
<refpurpose>
|
||||
Converts a string with ISO-8859-1 characters encoded with UTF-8
|
||||
to single-byte ISO-8859-1
|
||||
Converts a string from UTF-8 to ISO-8859-1, replacing invalid or unrepresentable
|
||||
characters
|
||||
</refpurpose>
|
||||
</refnamediv>
|
||||
|
||||
|
@ -20,9 +20,10 @@
|
|||
<literal>UTF-8</literal> encoding to <literal>ISO-8859-1</literal>. Bytes
|
||||
in the string which are not valid <literal>UTF-8</literal>, and
|
||||
<literal>UTF-8</literal> characters which do not exist in
|
||||
<literal>ISO-8859-1</literal> (that is, characters above
|
||||
<literal>ISO-8859-1</literal> (that is, code points above
|
||||
<literal>U+00FF</literal>) are replaced with <literal>?</literal>.
|
||||
</para>
|
||||
|
||||
<note>
|
||||
<para>
|
||||
Many web pages marked as using the <literal>ISO-8859-1</literal> character
|
||||
|
@ -62,6 +63,42 @@
|
|||
</para>
|
||||
</refsect1>
|
||||
|
||||
<refsect1 role="examples">
|
||||
&reftitle.examples;
|
||||
<example>
|
||||
<title>Basic examples</title>
|
||||
<programlisting role="php">
|
||||
<![CDATA[
|
||||
<?php
|
||||
// Convert the string 'Zoë' from UTF-8 to ISO 8859-1
|
||||
$utf8_string = "\x5A\x6F\xC3\xAB";
|
||||
$iso8859_1_string = utf8_decode($utf8_string);
|
||||
echo bin2hex($iso8859_1_string), "\n";
|
||||
|
||||
// Invalid UTF-8 sequences are replaced with '?'
|
||||
$invalid_utf8_string = "\xC3";
|
||||
$iso8859_1_string = utf8_decode($invalid_utf8_string);
|
||||
var_dump($iso8859_1_string);
|
||||
|
||||
// Characters which don't exist in ISO 8859-1, such as
|
||||
// '€' (Euro Sign) are also replaced with '?'
|
||||
$utf8_string = "\xE2\x82\xAC";
|
||||
$iso8859_1_string = utf8_decode($utf8_string);
|
||||
var_dump($iso8859_1_string);
|
||||
?>
|
||||
]]>
|
||||
</programlisting>
|
||||
&example.outputs;
|
||||
<screen>
|
||||
<![CDATA[
|
||||
5a6feb
|
||||
string(1) "?"
|
||||
string(1) "?"
|
||||
]]>
|
||||
</screen>
|
||||
</example>
|
||||
</refsect1>
|
||||
|
||||
<refsect1 role="changelog">
|
||||
&reftitle.changelog;
|
||||
<para>
|
||||
|
@ -77,8 +114,8 @@
|
|||
<row>
|
||||
<entry>7.2.0</entry>
|
||||
<entry>
|
||||
This function has been moved to the core of PHP, and therefore lifting the requirement
|
||||
on the XML extension for this function to be available.
|
||||
This function has been moved from the XML extension to the core of PHP.
|
||||
In previous versions, it was only available if the XML extension was installed.
|
||||
</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
|
@ -91,10 +128,10 @@
|
|||
&reftitle.seealso;
|
||||
<para>
|
||||
<simplelist>
|
||||
<member><function>utf8_encode</function> - Performs the reverse conversion</member>
|
||||
<member><function>mb_convert_encoding</function> - Converts between various character encodings, including UTF-8, ISO-8859-1 and Windows-1252</member>
|
||||
<member><function>iconv</function> - Converts between various character encodings</member>
|
||||
<member><function>recode_string</function> - Converts between various character encodings</member>
|
||||
<member><function>utf8_encode</function></member>
|
||||
<member><function>mb_convert_encoding</function></member>
|
||||
<member><methodname>UConverter::transcode</methodname></member>
|
||||
<member><function>iconv</function></member>
|
||||
</simplelist>
|
||||
</para>
|
||||
</refsect1>
|
|
@ -3,7 +3,7 @@
|
|||
<refentry xmlns="http://docbook.org/ns/docbook" xml:id="function.utf8-encode">
|
||||
<refnamediv>
|
||||
<refname>utf8_encode</refname>
|
||||
<refpurpose>Encodes an ISO-8859-1 string to UTF-8</refpurpose>
|
||||
<refpurpose>Converts a string from ISO-8859-1 to UTF-8</refpurpose>
|
||||
</refnamediv>
|
||||
|
||||
<refsect1 role="description">
|
||||
|
@ -16,7 +16,15 @@
|
|||
This function converts the string <parameter>string</parameter> from the
|
||||
<literal>ISO-8859-1</literal> encoding to <literal>UTF-8</literal>.
|
||||
</para>
|
||||
|
||||
<note>
|
||||
<para>
|
||||
This function does not attempt to guess the current encoding of the provided
|
||||
string, it assumes it is encoded as ISO-8859-1 (also known as "Latin 1")
|
||||
and converts to UTF-8. Since every sequence of bytes is a valid ISO-8859-1
|
||||
string, this never results in an error, but will not result in a useful string
|
||||
if a different encoding was intended.
|
||||
</para>
|
||||
<para>
|
||||
Many web pages marked as using the <literal>ISO-8859-1</literal> character
|
||||
encoding actually use the similar <literal>Windows-1252</literal> encoding,
|
||||
|
@ -55,6 +63,29 @@
|
|||
</para>
|
||||
</refsect1>
|
||||
|
||||
<refsect1 role="examples">
|
||||
&reftitle.examples;
|
||||
<example>
|
||||
<title>Basic example</title>
|
||||
<programlisting role="php">
|
||||
<![CDATA[
|
||||
<?php
|
||||
// Convert the string 'Zoë' from ISO 8859-1 to UTF-8
|
||||
$iso8859_1_string = "\x5A\x6F\xEB";
|
||||
$utf8_string = utf8_encode($iso8859_1_string);
|
||||
echo bin2hex($utf8_string), "\n";
|
||||
?>
|
||||
]]>
|
||||
</programlisting>
|
||||
&example.outputs;
|
||||
<screen>
|
||||
<![CDATA[
|
||||
5a6fc3ab
|
||||
]]>
|
||||
</screen>
|
||||
</example>
|
||||
</refsect1>
|
||||
|
||||
<refsect1 role="changelog">
|
||||
&reftitle.changelog;
|
||||
<para>
|
||||
|
@ -70,8 +101,8 @@
|
|||
<row>
|
||||
<entry>7.2.0</entry>
|
||||
<entry>
|
||||
This function has been moved to the core of PHP, and therefore lifting the requirement
|
||||
on the XML extension for this function to be available.
|
||||
This function has been moved from the XML extension to the core of PHP.
|
||||
In previous versions, it was only available if the XML extension was installed.
|
||||
</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
|
@ -84,10 +115,10 @@
|
|||
&reftitle.seealso;
|
||||
<para>
|
||||
<simplelist>
|
||||
<member><function>utf8_decode</function> - Performs the reverse conversion</member>
|
||||
<member><function>mb_convert_encoding</function> - Converts between various character encodings, including UTF-8, ISO-8859-1 and Windows-1252</member>
|
||||
<member><function>iconv</function> - Converts between various character encodings</member>
|
||||
<member><function>recode_string</function> - Converts between various character encodings</member>
|
||||
<member><function>utf8_decode</function></member>
|
||||
<member><function>mb_convert_encoding</function></member>
|
||||
<member><methodname>UConverter::transcode</methodname></member>
|
||||
<member><function>iconv</function></member>
|
||||
</simplelist>
|
||||
</para>
|
||||
</refsect1>
|
|
@ -101,6 +101,8 @@
|
|||
<function name="trim" from="PHP 4, PHP 5, PHP 7, PHP 8"/>
|
||||
<function name="ucfirst" from="PHP 4, PHP 5, PHP 7, PHP 8"/>
|
||||
<function name="ucwords" from="PHP 4, PHP 5, PHP 7, PHP 8"/>
|
||||
<function name="utf8_decode" from="PHP 4, PHP 5, PHP 7, PHP 8"/>
|
||||
<function name="utf8_encode" from="PHP 4, PHP 5, PHP 7, PHP 8"/>
|
||||
<function name="vfprintf" from="PHP 5, PHP 7, PHP 8"/>
|
||||
<function name="vprintf" from="PHP 4 >= 4.1.0, PHP 5, PHP 7, PHP 8"/>
|
||||
<function name="vsprintf" from="PHP 4 >= 4.1.0, PHP 5, PHP 7, PHP 8"/>
|
||||
|
|
|
@ -66,9 +66,9 @@ echo $packet;
|
|||
|
||||
<note>
|
||||
<para>
|
||||
If you want to serialize non-ASCII characters you have to convert
|
||||
your data to UTF-8 first (see <function>utf8_encode</function> and
|
||||
<function>iconv</function>).
|
||||
Strings should be encoded in UTF-8; to handle other encodings, convert
|
||||
the string first using <function>mb_convert_encoding</function>,
|
||||
<methodname>UConverter::transcode</methodname>, or <function>iconv</function>.
|
||||
</para>
|
||||
</note>
|
||||
</section>
|
||||
|
|
|
@ -4,8 +4,6 @@
|
|||
Do NOT translate this file
|
||||
-->
|
||||
<versions>
|
||||
<function name="utf8_decode" from="PHP 4, PHP 5, PHP 7, PHP 8"/>
|
||||
<function name="utf8_encode" from="PHP 4, PHP 5, PHP 7, PHP 8"/>
|
||||
<function name="xml_error_string" from="PHP 4, PHP 5, PHP 7, PHP 8"/>
|
||||
<function name="xml_get_current_byte_index" from="PHP 4, PHP 5, PHP 7, PHP 8"/>
|
||||
<function name="xml_get_current_column_number" from="PHP 4, PHP 5, PHP 7, PHP 8"/>
|
||||
|
|