Rewrite debug_zval_dump manual based on current (>=7.0) behaviour

- The recommended use of a call-time pass-by-reference has been
  impossible since PHP 5.4, making the examples unusable.
- The re-design of references in PHP 7.0 means that the refcount
  no longer reflects these.

Closes GH-466.
Rowan Tommins 2021-02-28 21:40:01 +00:00 committed by Christoph M. Becker
parent 95bc76b545
commit d08d2e887f

@@ -3,7 +3,7 @@
<refentry xml:id="function.debug-zval-dump" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">
<refnamediv>
<refname>debug_zval_dump</refname>
<refpurpose>Dumps a string representation of an internal zend value to output</refpurpose>
<refpurpose>Dumps a string representation of an internal zval structure to output</refpurpose>
</refnamediv>
<refsect1 role="description">
&reftitle.description;
@@ -13,7 +13,9 @@
<methodparam rep="repeat"><type>mixed</type><parameter>values</parameter></methodparam>
</methodsynopsis>
<para>
Dumps a string representation of an internal zend value to output.
Dumps a string representation of an internal zval (Zend value) structure to output.
This is mostly useful for understanding or debugging implementation details of the
Zend Engine or PHP extensions.
</para>
</refsect1>
<refsect1 role="parameters">
@@ -24,7 +26,7 @@
<term><parameter>value</parameter></term>
<listitem>
<para>
The variable to dump.
The variable or value to dump.
</para>
</listitem>
</varlistentry>
@@ -32,7 +34,7 @@
<term><parameter>values</parameter></term>
<listitem>
<para>
Further variables to dump.
Further variables or values to dump.
</para>
</listitem>
</varlistentry>
@@ -53,36 +55,60 @@
<programlisting role="php">
<![CDATA[
<?php
$var1 = 'Hello World';
$var2 = '';
$var1 = 'Hello';
$var1 .= ' World';
$var2 = $var1;
$var2 =& $var1;
debug_zval_dump(&$var1);
debug_zval_dump($var1);
?>
]]>
</programlisting>
&example.outputs;
<screen>
<![CDATA[
&string(11) "Hello World" refcount(3)
string(11) "Hello World" refcount(3)
]]>
</screen>
</example>
</para>
<note>
<title>Beware the <literal>refcount</literal></title>
<title>Understanding the <literal>refcount</literal></title>
<para>
The <literal>refcount</literal> value returned by this function is
non-obvious in certain circumstances. For example, a developer might
expect the above example to indicate a <literal>refcount</literal> of
<literal>2</literal>. The third reference is created when actually
calling <function>debug_zval_dump</function>.
The <literal>refcount</literal> value shown by this function may be
surprising without a detailed understanding of the engine's implementation.
</para>
<para>
This behavior is further compounded when a variable is not passed to
<function>debug_zval_dump</function> by reference. To illustrate, consider
a slightly modified version of the above example:
The Zend Engine uses reference counting for two different purposes:
</para>
<para>
<simplelist>
<member>
Optimizing memory usage using a technique called "copy on write",
where multiple variables holding the same value point to the same copy
in memory. When any of the variables is modified, it is pointed to a new
copy in memory, and the reference count on the original is decreased by 1.
</member>
<member>
Tracking variables which have been assigned or passed by reference (see
<link linkend="language.references">References Explained</link>). This
refcount is stored on a separate reference zval, pointing to the zval
for the current value. This additional zval is not currently shown by
<function>debug_zval_dump</function>.
</member>
</simplelist>
</para>
<para>
Because <function>debug_zval_dump</function> takes its input as normal
parameters, passed by value, the copy on write technique will be used
to pass them: rather than copying the data, the refcount will be increased
by one for the lifetime of the function call. If the function modified the
parameter after receiving it, then a copy would be made; since it does not,
it will show a refcount one higher than in the calling scope.
</para>
<para>
The parameter passing also prevents <function>debug_zval_dump</function>
showing variables which have been assigned by reference. To illustrate,
consider a slightly modified version of the above example:
</para>
<para>
<example>
@@ -90,39 +116,11 @@ debug_zval_dump(&$var1);
<programlisting role="php">
<![CDATA[
<?php
$var1 = 'Hello World';
$var2 = '';
$var1 = 'Hello';
$var1 .= ' World';
// Point three variables as references to the same value
$var2 =& $var1;
debug_zval_dump($var1); // not passed by reference, this time
?>
]]>
</programlisting>
&example.outputs;
<screen>
<![CDATA[
string(11) "Hello World" refcount(1)
]]>
</screen>
</example>
</para>
<para>
Why <literal>refcount(1)</literal>? Because a copy of <literal>$var1</literal> is
being made, when the function is called.
</para>
<para>
This function becomes even <emphasis>more</emphasis> confusing when a
variable with a <literal>refcount</literal> of <literal>1</literal> is
passed (by copy/value):
</para>
<para>
<example>
<title/>
<programlisting role="php">
<![CDATA[
<?php
$var1 = 'Hello World';
$var3 =& $var1;
debug_zval_dump($var1);
?>
@@ -137,25 +135,18 @@ string(11) "Hello World" refcount(2)
</example>
</para>
<para>
A <literal>refcount</literal> of <literal>2</literal>, here, is extremely
non-obvious. Especially considering the above examples. So what's
happening?
Although <varname>$var1</varname>, <varname>$var2</varname>, and
<varname>$var3</varname> are linked as references, only the
<emphasis>value</emphasis> is passed to <function>debug_zval_dump</function>.
That value is used once by the set of references, and once inside the
<function>debug_zval_dump</function>, so shows a refcount of 2.
</para>
<para>
When a variable has a single reference (as did <literal>$var1</literal>
before it was used as an argument to <function>debug_zval_dump</function>),
PHP's engine optimizes the manner in which it is passed to a function.
Internally, PHP treats <literal>$var1</literal> like a reference (in that
the <literal>refcount</literal> is increased for the scope of this
function), with the caveat that <emphasis>if</emphasis> the passed reference
happens to be written to, a copy is made, but only at the moment of
writing. This is known as "copy on write."
</para>
<para>
So, if <function>debug_zval_dump</function> happened to write to its sole
parameter (and it doesn't), then a copy would be made. Until then, the
parameter remains a reference, causing the <literal>refcount</literal> to
be incremented to <literal>2</literal> for the scope of the function call.
Further complications arise because of optimisations made in the engine for
different data types. Some types such as integers do not use "copy on write",
so do not show a refcount at all. In other cases, the refcount shows extra
copies used internally, such as when a literal string or array is stored as
part of a code instruction.
</para>
</note>
</refsect1>
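
The refcount behaviour described in the rewritten note can be checked with a short script. The following is a minimal sketch and not part of the commit: the helper `refcount_from_dump()` and the output-buffering capture are assumptions added for illustration, and the exact numbers printed may differ between PHP versions and opcache settings.

```php
<?php
// Hypothetical helper for this sketch: extract N from "refcount(N)".
// Returns null when the dump contains no refcount at all.
function refcount_from_dump(string $dump)
{
    return preg_match('/refcount\((\d+)\)/', $dump, $m) ? (int) $m[1] : null;
}

// Build the string at run time, as the manual example does, so that it is
// an ordinary refcounted string rather than a literal stored with the code.
$var1 = 'Hello';
$var1 .= ' World';

// Output buffering is used only to capture what debug_zval_dump() prints.
ob_start();
debug_zval_dump($var1);
$single = ob_get_clean();      // held by $var1 and by the function parameter

$var2 = $var1;                 // plain assignment: both variables share the string
ob_start();
debug_zval_dump($var1);
$shared = ob_get_clean();

$var2 .= '!';                  // writing to $var2 separates it from $var1
ob_start();
debug_zval_dump($var1);
$separated = ob_get_clean();

$var3 =& $var1;                // reference: only the value is passed to the dump
ob_start();
debug_zval_dump($var1);
$referenced = ob_get_clean();

$int = 42;                     // integers do not use copy on write
ob_start();
debug_zval_dump($int);
$intDump = ob_get_clean();

echo $single, $shared, $separated, $referenced, $intDump;
```

If the note's description holds, the second dump should report a refcount one higher than the first, the third and fourth should match the first, and the integer dump should carry no refcount at all.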