Improve mb_strwidth() docs

* clarify the meaning of "width" in this context
* fix character width info (cf.
  https://github.com/php/php-src/blob/master/ext/mbstring/libmbfl/mbfl/eaw_table.h)
* remove the misinformation that multibyte characters would usually be
  twice as wide as single byte characters

git-svn-id: https://svn.php.net/repository/phpdoc/en/trunk@344481 c90b9560-bf6c-de11-be94-00142212c4b1
Christoph Michael Becker 2018-03-11 15:42:33 +00:00
parent 21e01236e0
commit 7644e2087c

@@ -14,45 +14,52 @@
 <methodparam choice="opt"><type>string</type><parameter>encoding</parameter><initializer>mb_internal_encoding()</initializer></methodparam>
 </methodsynopsis>
 <para>
-Returns the width of <type>string</type> <parameter>str</parameter>.
+Returns the width of <type>string</type> <parameter>str</parameter>,
+where halfwidth characters count as <literal>1</literal>, and fullwidth
+characters count as <literal>2</literal>.
 </para>
 <para>
-Multi-byte characters are usually twice the width of single byte characters.
-</para>
-<para>
-<table>
-<title>Characters width</title>
-<tgroup cols="2">
-<thead>
-<row>
-<entry>Chars</entry>
-<entry>Width</entry>
-</row>
-</thead>
-<tbody>
-<row>
-<entry>U+0000 - U+0019</entry>
-<entry>0</entry>
-</row>
-<row>
-<entry>U+0020 - U+1FFF</entry>
-<entry>1</entry>
-</row>
-<row>
-<entry>U+2000 - U+FF60</entry>
-<entry>2</entry>
-</row>
-<row>
-<entry>U+FF61 - U+FF9F</entry>
-<entry>1</entry>
-</row>
-<row>
-<entry>U+FFA0 - </entry>
-<entry>2</entry>
-</row>
-</tbody>
-</tgroup>
-</table>
+The fullwidth characters are:
+<literal>U+1100</literal>-<literal>U+115F</literal>,
+<literal>U+11A3</literal>-<literal>U+11A7</literal>,
+<literal>U+11FA</literal>-<literal>U+11FF</literal>,
+<literal>U+2329</literal>-<literal>U+232A</literal>,
+<literal>U+2E80</literal>-<literal>U+2E99</literal>,
+<literal>U+2E9B</literal>-<literal>U+2EF3</literal>,
+<literal>U+2F00</literal>-<literal>U+2FD5</literal>,
+<literal>U+2FF0</literal>-<literal>U+2FFB</literal>,
+<literal>U+3000</literal>-<literal>U+303E</literal>,
+<literal>U+3041</literal>-<literal>U+3096</literal>,
+<literal>U+3099</literal>-<literal>U+30FF</literal>,
+<literal>U+3105</literal>-<literal>U+312D</literal>,
+<literal>U+3131</literal>-<literal>U+318E</literal>,
+<literal>U+3190</literal>-<literal>U+31BA</literal>,
+<literal>U+31C0</literal>-<literal>U+31E3</literal>,
+<literal>U+31F0</literal>-<literal>U+321E</literal>,
+<literal>U+3220</literal>-<literal>U+3247</literal>,
+<literal>U+3250</literal>-<literal>U+32FE</literal>,
+<literal>U+3300</literal>-<literal>U+4DBF</literal>,
+<literal>U+4E00</literal>-<literal>U+A48C</literal>,
+<literal>U+A490</literal>-<literal>U+A4C6</literal>,
+<literal>U+A960</literal>-<literal>U+A97C</literal>,
+<literal>U+AC00</literal>-<literal>U+D7A3</literal>,
+<literal>U+D7B0</literal>-<literal>U+D7C6</literal>,
+<literal>U+D7CB</literal>-<literal>U+D7FB</literal>,
+<literal>U+F900</literal>-<literal>U+FAFF</literal>,
+<literal>U+FE10</literal>-<literal>U+FE19</literal>,
+<literal>U+FE30</literal>-<literal>U+FE52</literal>,
+<literal>U+FE54</literal>-<literal>U+FE66</literal>,
+<literal>U+FE68</literal>-<literal>U+FE6B</literal>,
+<literal>U+FF01</literal>-<literal>U+FF60</literal>,
+<literal>U+FFE0</literal>-<literal>U+FFE6</literal>,
+<literal>U+1B000</literal>-<literal>U+1B001</literal>,
+<literal>U+1F200</literal>-<literal>U+1F202</literal>,
+<literal>U+1F210</literal>-<literal>U+1F23A</literal>,
+<literal>U+1F240</literal>-<literal>U+1F248</literal>,
+<literal>U+1F250</literal>-<literal>U+1F251</literal>,
+<literal>U+20000</literal>-<literal>U+2FFFD</literal>,
+<literal>U+30000</literal>-<literal>U+3FFFD</literal>.
+All other characters are halfwidth characters.
 </para>
 </refsect1>
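
The rule in the new wording (fullwidth counts as 2, everything else as 1) can be illustrated with a minimal Python sketch. This is not PHP's implementation, which reads its ranges from eaw_table.h; it only applies the ranges listed in this diff. The names FULLWIDTH_RANGES, is_fullwidth and str_width are hypothetical, and the sketch assumes the input is already decoded into Unicode code points.

```python
# Fullwidth ranges copied from the list in the diff (inclusive code point bounds).
FULLWIDTH_RANGES = [
    (0x1100, 0x115F), (0x11A3, 0x11A7), (0x11FA, 0x11FF), (0x2329, 0x232A),
    (0x2E80, 0x2E99), (0x2E9B, 0x2EF3), (0x2F00, 0x2FD5), (0x2FF0, 0x2FFB),
    (0x3000, 0x303E), (0x3041, 0x3096), (0x3099, 0x30FF), (0x3105, 0x312D),
    (0x3131, 0x318E), (0x3190, 0x31BA), (0x31C0, 0x31E3), (0x31F0, 0x321E),
    (0x3220, 0x3247), (0x3250, 0x32FE), (0x3300, 0x4DBF), (0x4E00, 0xA48C),
    (0xA490, 0xA4C6), (0xA960, 0xA97C), (0xAC00, 0xD7A3), (0xD7B0, 0xD7C6),
    (0xD7CB, 0xD7FB), (0xF900, 0xFAFF), (0xFE10, 0xFE19), (0xFE30, 0xFE52),
    (0xFE54, 0xFE66), (0xFE68, 0xFE6B), (0xFF01, 0xFF60), (0xFFE0, 0xFFE6),
    (0x1B000, 0x1B001), (0x1F200, 0x1F202), (0x1F210, 0x1F23A),
    (0x1F240, 0x1F248), (0x1F250, 0x1F251), (0x20000, 0x2FFFD),
    (0x30000, 0x3FFFD),
]


def is_fullwidth(code_point: int) -> bool:
    """Return True if the code point falls inside one of the listed ranges."""
    return any(low <= code_point <= high for low, high in FULLWIDTH_RANGES)


def str_width(text: str) -> int:
    """Sum the widths: 2 per fullwidth character, 1 per any other character."""
    return sum(2 if is_fullwidth(ord(ch)) else 1 for ch in text)


if __name__ == "__main__":
    print(str_width("abc"))                   # 3: three halfwidth characters
    print(str_width("\u65e5\u672c\u8a9e"))    # 6: three CJK ideographs (U+4E00-U+A48C)
    print(str_width("\uff71"))                # 1: halfwidth katakana, outside every range
    print(str_width("\uff21"))                # 2: fullwidth Latin A (U+FF01-U+FF60)
```

Note that under this rule control characters also count as 1, whereas the removed table gave them a width of 0; that follows from the "all other characters are halfwidth" sentence in the added text.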