diff --git a/reference/mbstring/encodings.xml b/reference/mbstring/encodings.xml
new file mode 100644
index 0000000000..4786b26678
--- /dev/null
+++ b/reference/mbstring/encodings.xml
@@ -0,0 +1,879 @@
+
+
+
+ Summaries of supported encodings
+
+ UCS-4
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ ISO-10646-UCS-4
+ ISO 10646
+
+ The Universal Character Set with 31-bit code space, standardized as UCS-4
+ by ISO/IEC 10646. It is kept synchronized with the latest version of the
+ Unicode code map.
+
+
+ If this name is used in the encoding conversion facility,
+ the converter attempts to determine from the preceding BOM
+ (byte order mark) which byte order the subsequent bytes
+ are in.
+
+
+
+
+ UCS-4BE
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ ISO-10646-UCS-4
+ UCS-4
+
+ See above.
+
+
+ In contrast to UCS-4, strings are always assumed
+ to be in big endian form.
+
+
+
+
+ UCS-4LE
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ ISO-10646-UCS-4
+ UCS-4
+
+ See above.
+
+
+ In contrast to UCS-4, strings are always assumed
+ to be in little endian form.
+
+
+
+
+ UCS-2
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ ISO-10646-UCS-2
+ UCS-2
+
+ The Universal Character Set with 16-bit code space, standardized as UCS-2
+ by ISO/IEC 10646. It is kept synchronized with the latest version of the
+ Unicode code map.
+
+
+ If this name is used in the encoding conversion facility,
+ the converter attempts to determine from the preceding BOM
+ (byte order mark) which byte order the subsequent bytes
+ are in.
+
+
+
+
+ UCS-2BE
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ ISO-10646-UCS-2
+ UCS-2
+
+ See above.
+
+
+ In contrast to UCS-2, strings are always assumed
+ to be in big endian form.
+
+
+
+
+ UCS-2LE
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ ISO-10646-UCS-2
+ UCS-2
+
+ See above.
+
+
+ In contrast to UCS-2, strings are always assumed
+ to be in little endian form.
+
+
+
+
+ UTF-32
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ UTF-32
+ Unicode
+
+ Unicode Transformation Format of 32-bit unit width, whose encoding space
+ refers to the Unicode codeset standard. This encoding scheme is not
+ identical to UCS-4 because the code space of Unicode is limited to
+ 21 bits.
+
+
+ If this name is used in the encoding conversion facility,
+ the converter attempts to determine from the preceding BOM
+ (byte order mark) which byte order the subsequent bytes
+ are in.
+
+
+
+
+ UTF-32BE
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ UTF-32BE
+ Unicode
+ See above.
+
+ In contrast to UTF-32, strings are always assumed
+ to be in big endian form.
+
+
+
+
+ UTF-32LE
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ UTF-32LE
+ Unicode
+ See above.
+
+ In contrast to UTF-32, strings are always assumed
+ to be in little endian form.
+
+
+
+
+ UTF-16
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ UTF-16
+ Unicode
+
+ Unicode Transformation Format of 16-bit unit width. Note
+ that UTF-16 is no longer the same specification as UCS-2, because the
+ surrogate mechanism was introduced in Unicode 2.0 and
+ UTF-16 now refers to a 21-bit code space.
+
+
+ If this name is used in the encoding conversion facility,
+ the converter attempts to determine from the preceding BOM
+ (byte order mark) which byte order the subsequent bytes
+ are in.
+
+
+
+
+ UTF-16BE
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ UTF-16BE
+ Unicode
+
+ See above.
+
+
+ In contrast to UTF-16, strings are always assumed
+ to be in big endian form.
+
+
+
+
+ UTF-16LE
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ UTF-16LE
+ Unicode
+
+ See above.
+
+
+ In contrast to UTF-16, strings are always assumed
+ to be in little endian form.
+
+
+
+
+ UTF-8
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ UTF-8
+ Unicode / UCS
+
+ Unicode Transformation Format of 8-bit unit width.
+
+ none
+
+
+
+ UTF-7
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ UTF-7
+ Unicode
+
+ A mail-safe transformation format of Unicode, specified in
+ RFC2152.
+
+ none
+
+
+
+ UTF7-IMAP
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ (none)
+ Unicode
+
+ A variant of UTF-7 which is specialized for use in the
+ IMAP protocol.
+
+ none
+
+
+
+ ASCII
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+ US-ASCII (preferred MIME name) / iso-ir-6 / ANSI_X3.4-1986 /
+ ISO_646.irv:1991 / ASCII / ISO646-US / us / IBM367 / CP367 / csASCII
+
+ ASCII / ISO 646
+
+ American Standard Code for Information Interchange is a commonly used
+ 7-bit encoding. It is also standardized internationally as ISO 646.
+
+ (none)
+
+
+
+ EUC-JP
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+ EUC-JP (preferred MIME name) /
+ Extended_UNIX_Code_Packed_Format_for_Japanese / csEUCPkdFmtJapanese
+
+
+ Compound of US-ASCII / JIS X0201:1997 (hankaku kana part) /
+ JIS X0208:1990 / JIS X0212:1990
+
+
+ As the name, an abbreviation of Extended UNIX Code Packed Format for
+ Japanese, suggests, this encoding is mostly used on UNIX and similar
+ platforms. The original encoding scheme, Extended UNIX Code, is
+ designed on the basis of ISO 2022.
+
+
+ The character set referred to by EUC-JP is different from IBM932 / CP932,
+ which are used by OS/2® and Microsoft® Windows®.
+ For information interchange with those platforms, use EUCJP-WIN instead.
+
+
+
+
+ SJIS
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ Shift_JIS (preferred MIME name) / MS_Kanji / csShift_JIS
+ Compound of JIS X0201:1997 / JIS X0208:1997
+
+ Shift_JIS was developed in the early 1980s, when personal Japanese word
+ processors were brought onto the market, in order to maintain
+ compatibility with the legacy encoding scheme JIS X 0201:1976.
+ According to the IANA definition, the codeset of Shift_JIS is slightly
+ different from IBM932 / CP932. However, the names "SJIS" / "Shift_JIS" are
+ often wrongly used to refer to these codesets.
+
+ For the CP932 codemap, use SJIS-WIN instead.
+
+
+
+ EUCJP-WIN
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ (none)
+
+ Compound of JIS X0201:1997 / JIS X0208:1997 / IBM extensions / NEC extensions
+
+
+ While this "encoding" uses the same encoding scheme as EUC-JP,
+ the underlying character set is different. That is, some code points map
+ to different characters than EUC-JP.
+
+ none
+
+
+
+ SJIS-win
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ Windows-31J / csWindows31J
+
+ Compound of JIS X0201:1997 / JIS X0208:1997 / IBM extensions / NEC extensions
+
+
+ While this "encoding" uses the same encoding scheme as
+ Shift_JIS, the underlying character set is different. That means some code
+ points map to different characters than Shift_JIS.
+
+ (none)
+
+
+
+ ISO-2022-JP
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+ ISO-2022-JP (preferred MIME name) / csISO2022JP
+
+ US-ASCII / JIS X0201:1976 / JIS X0208:1978 / JIS X0208:1983
+
+ RFC1468
+ (none)
+
+
+
+ JIS
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ ISO-8859-1
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ ISO-8859-2
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ ISO-8859-3
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ ISO-8859-4
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ ISO-8859-5
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ ISO-8859-6
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ ISO-8859-7
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ ISO-8859-8
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ ISO-8859-9
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ ISO-8859-10
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ ISO-8859-13
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ ISO-8859-14
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ ISO-8859-15
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ byte2be
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ byte2le
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ byte4be
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ byte4le
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ BASE64
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ HTML-ENTITIES
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ 7bit
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ 8bit
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ EUC-CN
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ CP936
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ HZ
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ EUC-TW
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ CP950
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ BIG-5
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ EUC-KR
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ UHC (CP949)
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ ISO-2022-KR
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ Windows-1251 (CP1251)
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ Windows-1252 (CP1252)
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ CP866 (IBM866)
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+ KOI8-R
+ Name in the IANA character set registry
+ Underlying character set
+ Description
+ Additional note
+
+
+
+
+
+
+
+
+
+
diff --git a/reference/mbstring/reference.xml b/reference/mbstring/reference.xml
index 5f4357344b..5b929ac34d 100644
--- a/reference/mbstring/reference.xml
+++ b/reference/mbstring/reference.xml
@@ -1,8 +1,8 @@
-
+
- Multi-Byte String Functions
- Multi-Byte String
+ Multibyte String Functions
+ Multibyte String
@@ -110,7 +110,6 @@ JIS, SJIS, ISO-2022-JP, BIG-5
scanner and the character encoding.
-
If you have some database connected with PHP, it is recommended that
@@ -148,13 +147,13 @@ JIS, SJIS, ISO-2022-JP, BIG-5
- In PHP 4.3.2 or earlier versions, mbstring
- there is a limitation in this functionality that
- mbstring does not perform character encoding
- conversion in POST data if the enctype attribute in
- the form element is set to
- multipart/form-data. So you have to convert
- the incoming data by yourself in this case if necessary.
+ In PHP 4.3.2 or earlier versions, there was a limitation in this
+ functionality: mbstring did not perform
+ character encoding conversion on POST data if the
+ enctype attribute of the form
+ element was set to multipart/form-data.
+ In that case you have to convert the incoming data yourself
+ if necessary.
Beginning with PHP 4.3.3, if enctype for HTML form is
@@ -257,300 +256,306 @@ ob_start('mb_output_handler');
-
- Supported Character Encodings
-
- Currently the following character encodings are supported by the
- mbstring module. Any of those Character encodings
- can be specified in the encoding parameter of
- mbstring functions.
-
-
- The following character encoding is supported in this PHP
- extension:
-
-
- UCS-4
- UCS-4BE
- UCS-4LE
- UCS-2
- UCS-2BE
- UCS-2LE
- UTF-32
- UTF-32BE
- UTF-32LE
- UTF-16
- UTF-16BE
- UTF-16LE
- UTF-7
- UTF7-IMAP
- UTF-8
- ASCII
- EUC-JP
- SJIS
- eucJP-win
- SJIS-win
- ISO-2022-JP
- JIS
- ISO-8859-1
- ISO-8859-2
- ISO-8859-3
- ISO-8859-4
- ISO-8859-5
- ISO-8859-6
- ISO-8859-7
- ISO-8859-8
- ISO-8859-9
- ISO-8859-10
- ISO-8859-13
- ISO-8859-14
- ISO-8859-15
- byte2be
- byte2le
- byte4be
- byte4le
- BASE64
- HTML-ENTITIES
- 7bit
- 8bit
- EUC-CN
- CP936
- HZ
- EUC-TW
- CP950
- BIG-5
- EUC-KR
- UHC (CP949)
- ISO-2022-KR
- Windows-1251 (CP1251)
- Windows-1252 (CP1252)
- CP866 (IBM866)
- KOI8-R
-
-
- &php.ini; entry, which accepts encoding name,
- accepts "auto" and
- "pass" also.
- mbstring functions, which accepts encoding
- name, and accepts "auto".
-
-
- If "pass" is set, no character
- encoding conversion is performed.
-
-
- If "auto" is set, it is expanded to
- the list of encodings defined per the NLS.
- For instance, if the NLS is set to Japanese,
- the value is assumed to be
- "ASCII,JIS,UTF-8,EUC-JP,SJIS".
-
-
- See also mb_detect_order
-
+
+ Supported Character Encodings
+
+ Currently the following character encodings are supported by the
+ mbstring module. Any of these character encodings
+ can be specified in the encoding parameter of
+ mbstring functions.
+
+
+ The following character encodings are supported in this PHP
+ extension:
+
+
+ UCS-4
+ UCS-4BE
+ UCS-4LE
+ UCS-2
+ UCS-2BE
+ UCS-2LE
+ UTF-32
+ UTF-32BE
+ UTF-32LE
+ UTF-16
+ UTF-16BE
+ UTF-16LE
+ UTF-7
+ UTF7-IMAP
+ UTF-8
+ ASCII
+ EUC-JP
+ SJIS
+ eucJP-win
+ SJIS-win
+ ISO-2022-JP
+ JIS
+ ISO-8859-1
+ ISO-8859-2
+ ISO-8859-3
+ ISO-8859-4
+ ISO-8859-5
+ ISO-8859-6
+ ISO-8859-7
+ ISO-8859-8
+ ISO-8859-9
+ ISO-8859-10
+ ISO-8859-13
+ ISO-8859-14
+ ISO-8859-15
+ byte2be
+ byte2le
+ byte4be
+ byte4le
+ BASE64
+ HTML-ENTITIES
+ 7bit
+ 8bit
+ EUC-CN
+ CP936
+ HZ
+ EUC-TW
+ CP950
+ BIG-5
+ EUC-KR
+ UHC (CP949)
+ ISO-2022-KR
+ Windows-1251 (CP1251)
+ Windows-1252 (CP1252)
+ CP866 (IBM866)
+ KOI8-R
+
+
+ Any &php.ini; entry that accepts an encoding name
+ also accepts "auto" and
+ "pass".
+ Any mbstring function that accepts an encoding
+ name also accepts "auto".
+
+
+ If "pass" is set, no character
+ encoding conversion is performed.
+
+
+ If "auto" is set, it is expanded to
+ the list of encodings defined per the NLS.
+ For instance, if the NLS is set to Japanese,
+ the value is assumed to be
+ "ASCII,JIS,UTF-8,EUC-JP,SJIS".
+
+
+ See also mb_detect_order
+
-
- Function Overloading Feature
-
-
- You might often find it difficult to get an existing PHP application
- work in a given multibyte environment. That's mostly because lots of
- PHP applications out there are written with the standard
- string functions such as substr, which are
- known to not properly handle multibyte-encoded strings.
-
-
- mbstring supports 'function overloading' feature which enables
- you to add multibyte awareness to such an application without
- code modification by overloading multibyte counterparts on
- the standard string functions. For example,
- mb_substr is called instead of
- substr if function overloading is enabled.
- This feature makes it easy to port applications that only support
- single-byte encodings to a multibyte environment in many cases.
-
-
- To use the function overloading, set
- mbstring.func_overload in &php.ini; to a
- positive value that represents a combination of bitmasks specifying
- the categories of functions to be overloaded. It should be set
- to 1 to overload the mail function. 2 for string
- functions, 4 for regular expression functions. For example,
- if is set for 7, mail, strings and regular expression functions should
- be overloaded. The list of overloaded functions are shown below.
-
-
+
+ Function Overloading Feature
+
+
+ You might often find it difficult to get an existing PHP application
+ to work in a given multibyte environment. That's mostly because lots of
+ PHP applications out there are written with the standard
+ string functions such as substr, which are
+ known to not properly handle multibyte-encoded strings.
+
+
+ mbstring supports a 'function overloading' feature which enables
+ you to add multibyte awareness to such an application without
+ code modification by overloading multibyte counterparts on
+ the standard string functions. For example,
+ mb_substr is called instead of
+ substr if function overloading is enabled.
+ This feature makes it easy to port applications that only support
+ single-byte encodings to a multibyte environment in many cases.
+
+
+ To use function overloading, set
+ mbstring.func_overload in &php.ini; to a
+ positive value that represents a combination of bitmasks specifying
+ the categories of functions to be overloaded. It should be set
+ to 1 to overload the mail function, 2 for string
+ functions, and 4 for regular expression functions. For example,
+ if it is set to 7, the mail, string and regular expression functions are
+ all overloaded. The list of overloaded functions is shown below.
+
+
+
+
+ It is not recommended to use the function overloading option in
+ the per-directory context, because it's not confirmed yet to be
+ stable enough in a production environment and may lead to undefined
+ behaviour.
+
+
- Basics of Japanese multi-byte encodings
-
- It is often said quite hard to figure out how Japanese texts are
- handled in the computer. This is not only because Japanese characters
- can only be represented by multibyte encodings, but because different
- encoding standards are adopted for different purposes / platforms.
- Moreover, not a few character set standards are used there, which
- are slightly different from one another. Those facts have often led
- developers to inevitable mess-up.
-
-
- To create a working web application that would be put in the Japanese
- environment, it is important to use the proper character encoding and
- character set for the task in hand.
-
-
-
-
- Storage for a character can be up to six bytes
-
-
-
- Most of multibyte characters often appear twice as wide as
- a single-byte character on display. Those characters are called
- "zen-kaku" in Japanese which means "full width", and the other
- (narrower) characters are called "han-kaku" - means half width.
- However the graphical properties of the characters depend on
- the glyphs of the type faces used to display them or print them out.
-
-
-
-
- Some character encodings use shift(escape) sequences defined
- in ISO2022 to switch the code map of the specific code area
- (00h to 7fh).
-
-
-
-
- ISO-2022-JP should be used in SMTP/NNTP, and headers and entities
- should be reencoded as per RFC requirements. Although those are not
- requisites, it's still a good idea because several popular user
- agents cannot recognize any other encoding methods.
-
-
-
-
- Webpages created for mobile phone services such as
- i-mode,
- Vodafone live!, or ezweb
- are supposed to use Shift_JIS.
-
-
-
-
+ Basics of Japanese multibyte encodings
+
+ It is often said to be quite hard to figure out how Japanese text is
+ handled by computers. This is not only because Japanese characters
+ can only be represented by multibyte encodings, but also because different
+ encoding standards are adopted for different purposes / platforms.
+ Moreover, several character set standards are in use, which
+ are slightly different from one another. Those facts have often led
+ developers into confusion.
+
+
+ To create a working web application that would be put in the Japanese
+ environment, it is important to use the proper character encoding and
+ character set for the task in hand.
+
+
+
+
+ Storage for a character can be up to six bytes
+
+
+
+ Most multibyte characters appear twice as wide as
+ a single-byte character on display. Those characters are called
+ "zen-kaku" in Japanese, which means "full width", and the other
+ (narrower) characters are called "han-kaku", which means "half width".
+ However, the graphical properties of the characters depend on
+ the glyphs of the type faces used to display them or print them out.
+
+
+
+
+ Some character encodings use shift (escape) sequences defined
+ in ISO 2022 to switch the code map of the specific code area
+ (00h to 7fh).
+
+
+
+
+ ISO-2022-JP should be used in SMTP/NNTP, and headers and entities
+ should be reencoded as per RFC requirements. Although those are not
+ requisites, it's still a good idea because several popular user
+ agents cannot recognize any other encoding methods.
+
+
+
+
+ Webpages created for mobile phone services such as
+ i-mode,
+ Vodafone live!, or EZweb
+ are supposed to use Shift_JIS.
+
+
+
+
- References
-
- Multibyte character encoding schemes and the related issues are very
- complicated. There should be too few space to cover in sufficient details.
- Please refer to the following URLs and other resources for
- further readings.
-
-
-
- Unicode materials
-
-
- &url.unicode;
-
-
-
-
- Japanese/Korean/Chinese character information
-
-
-
-
- ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf
-
-
-
-
-
-
+ References
+
+ Multibyte character encoding schemes and the related issues are very
+ complicated. There is too little space here to cover them in sufficient detail.
+ Please refer to the following URLs and other resources for
+ further reading.
+
+
+
+ Unicode materials
+
+
+ &url.unicode;
+
+
+
+
+ Japanese/Korean/Chinese character information
+
+
+ http://examples.oreilly.com/cjkvinfo/doc/cjk.inf
+
+
+
+
+&reference.mbstring.encodings;
+
&reference.mbstring.functions;