php-doc-en/reference/mbstring/encoding-requirements.xml
Daniel Egeberg 96c9d88bad Converted to utf-8
git-svn-id: https://svn.php.net/repository/phpdoc/en/trunk@297028 c90b9560-bf6c-de11-be94-00142212c4b1
2010-03-28 22:10:10 +00:00

110 lines
3 KiB
XML

<?xml version="1.0" encoding="utf-8"?>
<!-- $Revision$ -->
<chapter xml:id="mbstring.php4.req" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">
<title>PHP Character Encoding Requirements</title>
<para>
Encodings of the following types are safely used with PHP.
<itemizedlist>
<listitem>
<para>
A singlebyte encoding,
<itemizedlist>
<listitem>
<simpara>
which has ASCII-compatible (ISO646 compatible) mappings for the
characters in range of <literal>00h</literal> to
<literal>7fh</literal>.
</simpara>
</listitem>
</itemizedlist>
</para>
</listitem>
<listitem>
<para>
A multibyte encoding,
<itemizedlist>
<listitem>
<simpara>
which has ASCII-compatible mappings for the characters in range of
<literal>00h</literal> to <literal>7fh</literal>.
</simpara>
</listitem>
<listitem>
<simpara>
which don't use ISO2022 escape sequences.
</simpara>
</listitem>
<listitem>
<simpara>
which don't use a value from <literal>00h</literal> to
<literal>7fh</literal> in any of the compounded bytes
that represents a single character.
</simpara>
</listitem>
</itemizedlist>
</para>
</listitem>
</itemizedlist>
</para>
<para>
These are examples of character encodings that are unlikely to work
with PHP.
<informalexample>
<programlisting>
<![CDATA[
JIS, SJIS, ISO-2022-JP, BIG-5
]]>
</programlisting>
</informalexample>
</para>
<para>
Although PHP scripts written in any of those encodings might not work,
especially in the case where encoded strings appear as identifiers
or literals in the script, you can almost avoid using these encodings
by setting up the <literal>mbstring</literal>'s transparent encoding
filter function for incoming HTTP queries.
</para>
<note>
<para>
It's highly discouraged to use SJIS, BIG5, CP936, CP949 and GB18030 for
the internal encoding unless you are familiar with the parser, the
scanner and the character encoding.
</para>
</note>
<note>
<para>
If you are connecting to a database with PHP, it is recommended that
you use the same character encoding for both the database and the
<literal>internal encoding</literal> for ease of use and better
performance.
</para>
<para>
If you are using PostgreSQL, the character encoding used in the
database and the one used in PHP may differ as it supports
automatic character set conversion between the backend and the frontend.
</para>
</note>
</chapter>
<!-- Keep this comment at the end of the file
Local variables:
mode: sgml
sgml-omittag:t
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
indent-tabs-mode:nil
sgml-parent-document:nil
sgml-default-dtd-file:"~/.phpdoc/manual.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:nil
sgml-local-ecat-files:nil
End:
vim600: syn=xml fen fdm=syntax fdl=2 si
vim: et tw=78 syn=sgml
vi: ts=1 sw=1
-->