<?xml version="1.0" encoding="utf-8"?> <!-- $Revision$ --> <article xml:id="xml.encoding" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink"> <title>Character Encoding</title> <para> PHP's XML extension supports the <link xlink:href="&url.unicode;">Unicode</link> character set through different <glossterm>character encoding</glossterm>s. There are two types of character encodings, <glossterm>source encoding</glossterm> and <glossterm>target encoding</glossterm>. PHP's internal representation of the document is always encoded with <literal>UTF-8</literal>. </para> <para> Source encoding is done when an XML document is <link linkend="function.xml-parse">parsed</link>. Upon <link linkend="function.xml-parser-create">creating an XML parser</link>, a source encoding can be specified (this encoding can not be changed later in the XML parser's lifetime). The supported source encodings are <literal>ISO-8859-1</literal>, <literal>US-ASCII</literal> and <literal>UTF-8</literal>. The former two are single-byte encodings, which means that each character is represented by a single byte. <literal>UTF-8</literal> can encode characters composed by a variable number of bits (up to 21) in one to four bytes. The default source encoding used by PHP is <literal>ISO-8859-1</literal>. </para> <para> Target encoding is done when PHP passes data to XML handler functions. When an XML parser is created, the target encoding is set to the same as the source encoding, but this may be changed at any point. The target encoding will affect character data as well as tag names and processing instruction targets. </para> <para> If the XML parser encounters characters outside the range that its source encoding is capable of representing, it will return an error. </para> <para> If PHP encounters characters in the parsed XML document that can not be represented in the chosen target encoding, the problem characters will be "demoted". Currently, this means that such characters are replaced by a question mark. </para> </article> <!-- Keep this comment at the end of the file Local variables: mode: sgml sgml-omittag:t sgml-shorttag:t sgml-minimize-attributes:nil sgml-always-quote-attributes:t sgml-indent-step:1 sgml-indent-data:t indent-tabs-mode:nil sgml-parent-document:nil sgml-default-dtd-file:"~/.phpdoc/manual.ced" sgml-exposed-tags:nil sgml-local-catalogs:nil sgml-local-ecat-files:nil End: vim600: syn=xml fen fdm=syntax fdl=2 si vim: et tw=78 syn=sgml vi: ts=1 sw=1 -->