mirror of
https://github.com/sigmasternchen/php-doc-en
synced 2025-03-15 16:38:54 +00:00
688 lines
26 KiB
XML
688 lines
26 KiB
XML
<?xml version="1.0" encoding="utf-8"?>
|
|
<!-- $Revision$ -->
|
|
<chapter xml:id="features.gc" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">
|
|
<title>Garbage Collection</title>
|
|
|
|
<para>
|
|
This section explains the merits of the new Garbage Collection (also known
|
|
as GC) mechanism that is part of PHP 5.3.
|
|
</para>
|
|
|
|
<sect1 xml:id="features.gc.refcounting-basics">
|
|
<title>Reference Counting Basics</title>
|
|
<para>
|
|
A PHP variable is stored in a container called a "zval". A zval container
|
|
contains, besides the variable's type and value, two additional bits of
|
|
information. The first is called "is_ref" and is a boolean value
|
|
indicating whether or not the variable is part of a "reference set". With
|
|
this bit, PHP's engine knows how to differentiate between normal variables
|
|
and references. Since PHP allows user-land references, as created by the
|
|
& operator, a zval container also has an internal reference counting
|
|
mechanism to optimize memory usage. This second piece of additional
|
|
information, called "refcount", contains how many variable names (also
|
|
called symbols) point to this one zval container. All symbols are stored in
|
|
a symbol table, of which there is one per scope. There is a scope for the
|
|
main script (i.e., the one requested through the browser), as well as one
|
|
for every function or method.
|
|
</para>
|
|
<para>
|
|
A zval container is created when a new variable is created with a constant
|
|
value, such as:
|
|
<example>
|
|
<title>Creating a new zval container</title>
|
|
<programlisting role="php">
|
|
<![CDATA[
|
|
<?php
|
|
$a = "new string";
|
|
?>
|
|
]]>
|
|
</programlisting>
|
|
</example>
|
|
</para>
|
|
<para>
|
|
In this case, the new symbol name, <literal>a</literal>, is created in the current scope,
|
|
and a new variable container is created with the type <type>string</type> and the value
|
|
<literal>new string</literal>. The "is_ref" bit is by default set to &false; because no
|
|
user-land reference has been created. The "refcount" is set to <literal>1</literal> as
|
|
there is only one symbol that makes use of this variable container. Note
|
|
that references (i.e. "is_ref" is &true;) with "refcount" <literal>1</literal>, are
|
|
treated as if they are not references (i.e. as "is_ref" was &false;). If you have <link
|
|
xlink:href="&url.xdebug;">Xdebug</link> installed, you can display this
|
|
information by calling <function>xdebug_debug_zval</function>.
|
|
</para>
|
|
<para>
|
|
<example>
|
|
<title>Displaying zval information</title>
|
|
<programlisting role="php">
|
|
<![CDATA[
|
|
<?php
|
|
$a = "new string";
|
|
xdebug_debug_zval('a');
|
|
?>
|
|
]]>
|
|
</programlisting>
|
|
&example.outputs;
|
|
<screen>
|
|
<![CDATA[
|
|
a: (refcount=1, is_ref=0)='new string'
|
|
]]>
|
|
</screen>
|
|
</example>
|
|
</para>
|
|
<para>
|
|
Assigning this variable to another variable name will increase the refcount.
|
|
</para>
|
|
<para>
|
|
<example>
|
|
<title>Increasing refcount of a zval</title>
|
|
<programlisting role="php">
|
|
<![CDATA[
|
|
<?php
|
|
$a = "new string";
|
|
$b = $a;
|
|
xdebug_debug_zval( 'a' );
|
|
?>
|
|
]]>
|
|
</programlisting>
|
|
&example.outputs;
|
|
<screen>
|
|
<![CDATA[
|
|
a: (refcount=2, is_ref=0)='new string'
|
|
]]>
|
|
</screen>
|
|
</example>
|
|
</para>
|
|
<para>
|
|
The refcount is <literal>2</literal> here, because the same variable container is linked
|
|
with both <varname>a</varname> and <varname>b</varname>.
|
|
PHP is smart enough not to copy the actual variable
|
|
container when it is not necessary. Variable containers get destroyed when
|
|
the "refcount" reaches zero. The "refcount" gets decreased by one when any
|
|
symbol linked to the variable container leaves the scope (e.g. when the
|
|
function ends) or when a symbol is unassigned (e.g. by calling <function>unset</function>).
|
|
The following example shows this:
|
|
</para>
|
|
<para>
|
|
<example>
|
|
<title>Decreasing zval refcount</title>
|
|
<programlisting role="php">
|
|
<![CDATA[
|
|
<?php
|
|
$a = "new string";
|
|
$c = $b = $a;
|
|
xdebug_debug_zval( 'a' );
|
|
$b = 42;
|
|
xdebug_debug_zval( 'a' );
|
|
unset( $c );
|
|
xdebug_debug_zval( 'a' );
|
|
?>
|
|
]]>
|
|
</programlisting>
|
|
&example.outputs;
|
|
<screen>
|
|
<![CDATA[
|
|
a: (refcount=3, is_ref=0)='new string'
|
|
a: (refcount=2, is_ref=0)='new string'
|
|
a: (refcount=1, is_ref=0)='new string'
|
|
]]>
|
|
</screen>
|
|
</example>
|
|
</para>
|
|
<para>
|
|
If we now call <literal>unset($a);</literal>, the variable container, including the type
|
|
and value, will be removed from memory.
|
|
</para>
|
|
|
|
<sect2 xml:id="features.gc.compound-types">
|
|
<title>Compound Types</title>
|
|
|
|
<para>
|
|
Things get a tad more complex with compound types such as <type>array</type>s and
|
|
<type>object</type>s. As opposed to <type>scalar</type> values, <type>array</type>s
|
|
and <type>object</type>s store their
|
|
properties in a symbol table of their own. This means that the following
|
|
example creates three zval containers:
|
|
</para>
|
|
<para>
|
|
<example>
|
|
<title>Creating a <type>array</type> zval</title>
|
|
<programlisting role="php">
|
|
<![CDATA[
|
|
<?php
|
|
$a = array( 'meaning' => 'life', 'number' => 42 );
|
|
xdebug_debug_zval( 'a' );
|
|
?>
|
|
]]>
|
|
</programlisting>
|
|
&example.outputs.similar;
|
|
<screen>
|
|
<![CDATA[
|
|
a: (refcount=1, is_ref=0)=array (
|
|
'meaning' => (refcount=1, is_ref=0)='life',
|
|
'number' => (refcount=1, is_ref=0)=42
|
|
)
|
|
]]>
|
|
</screen>
|
|
<para>Or graphically</para>
|
|
<mediaobject>
|
|
<alt>Zvals for a simple array</alt>
|
|
<imageobject>
|
|
<imagedata fileref="en/features/figures/simple-array.png" format="PNG"/>
|
|
</imageobject>
|
|
</mediaobject>
|
|
</example>
|
|
</para>
|
|
<para>
|
|
The three zval containers are: <varname>a</varname>, <varname>meaning</varname>, and <varname>number</varname>.
|
|
Similar rules apply for increasing and decreasing "refcounts". Below, we add another
|
|
element to the array, and set its value to the contents of an already
|
|
existing element:
|
|
</para>
|
|
<para>
|
|
<example>
|
|
<title>Adding already existing element to an array</title>
|
|
<programlisting role="php">
|
|
<![CDATA[
|
|
<?php
|
|
$a = array( 'meaning' => 'life', 'number' => 42 );
|
|
$a['life'] = $a['meaning'];
|
|
xdebug_debug_zval( 'a' );
|
|
?>
|
|
]]>
|
|
</programlisting>
|
|
&example.outputs.similar;
|
|
<screen>
|
|
<![CDATA[
|
|
a: (refcount=1, is_ref=0)=array (
|
|
'meaning' => (refcount=2, is_ref=0)='life',
|
|
'number' => (refcount=1, is_ref=0)=42,
|
|
'life' => (refcount=2, is_ref=0)='life'
|
|
)
|
|
]]>
|
|
</screen>
|
|
<para>Or graphically</para>
|
|
<mediaobject>
|
|
<alt>Zvals for a simple array with a reference</alt>
|
|
<imageobject>
|
|
<imagedata fileref="en/features/figures/simple-array2.png" format="PNG"/>
|
|
</imageobject>
|
|
</mediaobject>
|
|
</example>
|
|
</para>
|
|
<para>
|
|
From the above Xdebug output, we see that both the old and new array
|
|
elements now point to a zval container whose "refcount" is
|
|
<literal>2</literal>. Although Xdebug's output shows two zval containers
|
|
with value <literal>'life'</literal>, they are the same one. The
|
|
<function>xdebug_debug_zval</function> function does not show this, but
|
|
you could see it by also displaying the memory pointer.
|
|
</para>
|
|
<para>
|
|
Removing an element from the array is like removing a symbol from a scope.
|
|
By doing so, the "refcount" of a container that an array element points to
|
|
is decreased. Again, when the "refcount" reaches zero, the variable
|
|
container is removed from memory. Again, an example to show this:
|
|
</para>
|
|
<para>
|
|
<example>
|
|
<title>Removing an element from an array</title>
|
|
<programlisting role="php">
|
|
<![CDATA[
|
|
<?php
|
|
$a = array( 'meaning' => 'life', 'number' => 42 );
|
|
$a['life'] = $a['meaning'];
|
|
unset( $a['meaning'], $a['number'] );
|
|
xdebug_debug_zval( 'a' );
|
|
?>
|
|
]]>
|
|
</programlisting>
|
|
&example.outputs.similar;
|
|
<screen>
|
|
<![CDATA[
|
|
a: (refcount=1, is_ref=0)=array (
|
|
'life' => (refcount=1, is_ref=0)='life'
|
|
)
|
|
]]>
|
|
</screen>
|
|
</example>
|
|
</para>
|
|
<para>
|
|
Now, things get interesting if we add the array itself as an element of
|
|
the array, which we do in the next example, in which we also sneak in a
|
|
reference operator, since otherwise PHP would create a copy:
|
|
</para>
|
|
<para>
|
|
<example>
|
|
<title>Adding the array itself as an element of it self</title>
|
|
<programlisting role="php">
|
|
<![CDATA[
|
|
<?php
|
|
$a = array( 'one' );
|
|
$a[] =& $a;
|
|
xdebug_debug_zval( 'a' );
|
|
?>
|
|
]]>
|
|
</programlisting>
|
|
&example.outputs.similar;
|
|
<screen>
|
|
<![CDATA[
|
|
a: (refcount=2, is_ref=1)=array (
|
|
0 => (refcount=1, is_ref=0)='one',
|
|
1 => (refcount=2, is_ref=1)=...
|
|
)
|
|
]]>
|
|
</screen>
|
|
<para>Or graphically</para>
|
|
<mediaobject>
|
|
<alt>Zvals for an array with a circular reference</alt>
|
|
<imageobject>
|
|
<imagedata fileref="en/features/figures/loop-array.png" format="PNG"/>
|
|
</imageobject>
|
|
</mediaobject>
|
|
</example>
|
|
</para>
|
|
<para>
|
|
You can see that the array variable (<varname>a</varname>) as well as the second element
|
|
(<varname>1</varname>) now point to a variable container that has a "refcount" of <literal>2</literal>. The
|
|
"..." in the display above shows that there is recursion involved, which,
|
|
of course, in this case means that the "..." points back to the original
|
|
array.
|
|
</para>
|
|
<para>
|
|
Just like before, unsetting a variable removes the symbol, and the
|
|
reference count of the variable container it points to is decreased by
|
|
one. So, if we unset variable <varname>$a</varname> after running the above code, the
|
|
reference count of the variable container that <varname>$a</varname> and element "1" point to
|
|
gets decreased by one, from "2" to "1". This can be represented as:
|
|
</para>
|
|
<para>
|
|
<example>
|
|
<title>Unsetting <varname>$a</varname></title>
|
|
<screen>
|
|
<![CDATA[
|
|
(refcount=1, is_ref=1)=array (
|
|
0 => (refcount=1, is_ref=0)='one',
|
|
1 => (refcount=1, is_ref=1)=...
|
|
)
|
|
]]>
|
|
</screen>
|
|
<para>Or graphically</para>
|
|
<mediaobject>
|
|
<alt>Zvals after removal of array with a circular reference demonstrating the memory leak</alt>
|
|
<imageobject>
|
|
<imagedata fileref="en/features/figures/leak-array.png" format="PNG"/>
|
|
</imageobject>
|
|
</mediaobject>
|
|
</example>
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2 xml:id="features.gc.cleanup-problems">
|
|
<title>Cleanup Problems</title>
|
|
<para>
|
|
Although there is no longer a symbol in any scope pointing to this
|
|
structure, it cannot be cleaned up because the array element "1" still
|
|
points to this same array. Because there is no external symbol pointing to
|
|
it, there is no way for a user to clean up this structure; thus you get a
|
|
memory leak. Fortunately, PHP will clean up this data structure at the end
|
|
of the request, but before then, this is taking up valuable space in
|
|
memory. This situation happens often if you're implementing parsing
|
|
algorithms or other things where you have a child point back at a "parent"
|
|
element. The same situation can also happen with objects of course, where
|
|
it actually is more likely to occur, as objects are always implicitly used
|
|
by reference.
|
|
</para>
|
|
<para>
|
|
This might not be a problem if this only happens once or twice, but if
|
|
there are thousands, or even millions, of these memory losses, this
|
|
obviously starts being a problem. This is especially problematic in long
|
|
running scripts, such as daemons where the request basically never ends,
|
|
or in large sets of unit tests. The latter caused problems while
|
|
running the unit tests for the Template component of the eZ Components
|
|
library. In some cases, it would require over 2 GB of memory, which the
|
|
test server didn't quite have.
|
|
</para>
|
|
</sect2>
|
|
</sect1>
|
|
|
|
<sect1 xml:id="features.gc.collecting-cycles">
|
|
<title>Collecting Cycles</title>
|
|
<para>
|
|
Traditionally, reference counting memory mechanisms, such as that used
|
|
previously by PHP, fail to address circular reference memory leaks;
|
|
however, as of 5.3.0, PHP implements the synchronous algorithm from the
|
|
<link xlink:href="&url.gc-paper;">Concurrent Cycle Collection in Reference Counted Systems</link>
|
|
paper which addresses that issue.
|
|
</para>
|
|
<para>
|
|
A full explanation of how the algorithm works would be slightly beyond the
|
|
scope of this section, but the basics are explained here. First of all,
|
|
we have to establish a few ground rules. If a refcount is increased, it's
|
|
still in use and therefore, not garbage. If the refcount is decreased and
|
|
hits zero, the zval can be freed. This means that garbage cycles can only
|
|
be created when a refcount argument is decreased to a non-zero value.
|
|
Secondly, in a garbage cycle, it is possible to discover which parts are
|
|
garbage by checking whether it is possible to decrease their refcount by
|
|
one, and then checking which of the zvals have a refcount of zero.
|
|
</para>
|
|
<para>
|
|
<mediaobject>
|
|
<alt>Garbage collection algorithm</alt>
|
|
<imageobject>
|
|
<imagedata fileref="en/features/figures/gc-algorithm.png" format="PNG"/>
|
|
</imageobject>
|
|
</mediaobject>
|
|
</para>
|
|
<para>
|
|
To avoid having to call the checking of garbage cycles with every possible
|
|
decrease of a refcount, the algorithm instead puts all possible roots
|
|
(zvals) in the "root buffer" (marking them "purple"). It also makes sure
|
|
that each possible garbage root ends up in the buffer only once. Only when
|
|
the root buffer is full does the collection mechanism start for all the
|
|
different zvals inside. See step A in the figure above.
|
|
</para>
|
|
<para>
|
|
In step B, the algorithm runs a depth-first search on all possible roots
|
|
to decrease by one the refcounts of each zval it finds, making sure not to
|
|
decrease a refcount on the same zval twice (by marking them as "grey"). In
|
|
step C, the algorithm again runs a depth-first search from each root node,
|
|
to check the refcount of each zval again. If it finds that the refcount is
|
|
zero, the zval is marked "white" (blue in the figure). If it's larger than
|
|
zero, it reverts the decreasing of the refcount by one with a depth-first
|
|
search from that point on, and they are marked "black" again. In the last
|
|
step (D), the algorithm walks over the root buffer removing the zval roots
|
|
from there, and meanwhile, checks which zvals have been marked "white" in
|
|
the previous step. Every zval marked as "white" will be freed.
|
|
</para>
|
|
<para>
|
|
Now that you have a basic understanding of how the algorithm works, we
|
|
will look back at how this integrates with PHP. By default, PHP's garbage
|
|
collector is turned on. There is, however, a &php.ini;
|
|
setting that allows you to change this:
|
|
<link linkend="ini.zend.enable-gc">zend.enable_gc</link>.
|
|
</para>
|
|
<para>
|
|
When the garbage collector is turned on, the cycle-finding algorithm as
|
|
described above is executed whenever the root buffer runs full. The root
|
|
buffer has a fixed size of 10,000 possible roots (although you can alter
|
|
this by changing the <literal>GC_ROOT_BUFFER_MAX_ENTRIES</literal> constant in
|
|
<literal>Zend/zend_gc.c</literal> in the PHP source code, and re-compiling
|
|
PHP). When the garbage collector is turned off, the cycle-finding
|
|
algorithm will never run. However, possible roots will always be recorded
|
|
in the root buffer, no matter whether the garbage collection mechanism has
|
|
been activated with this configuration setting.
|
|
</para>
|
|
<para>
|
|
If the root buffer becomes full with possible roots while the garbage
|
|
collection mechanism is turned off, further possible roots will simply not
|
|
be recorded. Those possible roots that are not recorded will never be
|
|
analyzed by the algorithm. If they were part of a circular reference
|
|
cycle, they would never be cleaned up and would create a memory leak.
|
|
</para>
|
|
<para>
|
|
The reason why possible roots are recorded even if the mechanism has been
|
|
disabled is because it's faster to record possible roots than to have to
|
|
check whether the mechanism is turned on every time a possible root could
|
|
be found. The garbage collection and analysis mechanism itself, however,
|
|
can take a considerable amount of time.
|
|
</para>
|
|
<para>
|
|
Besides changing the <link linkend="ini.zend.enable-gc">zend.enable_gc</link> configuration
|
|
setting, it is also possible to turn the garbage collecting mechanism on
|
|
and off by calling <function>gc_enable</function> or
|
|
<function>gc_disable</function> respectively. Calling those functions has
|
|
the same effect as turning on or off the mechanism with the configuration
|
|
setting. It is also possible to force the collection of cycles even if the
|
|
possible root buffer is not full yet. For this, you can use the
|
|
<function>gc_collect_cycles</function> function. This function will return
|
|
how many cycles were collected by the algorithm.
|
|
</para>
|
|
<para>
|
|
The rationale behind the ability to turn the mechanism on and off, and to
|
|
initiate cycle collection yourself, is that some parts of your application
|
|
could be highly time-sensitive. In those cases, you might not want the
|
|
garbage collection mechanism to kick in. Of course, by turning off the
|
|
garbage collection for certain parts of your application, you do risk
|
|
creating memory leaks because some possible roots might not fit into the
|
|
limited root buffer. Therefore, it is probably wise to call
|
|
<function>gc_collect_cycles</function> just before you call
|
|
<function>gc_disable</function> to free up the memory that could be lost
|
|
through possible roots that are already recorded in the root buffer. This
|
|
then leaves an empty buffer so that there is more space for storing
|
|
possible roots while the cycle collecting mechanism is turned off.
|
|
</para>
|
|
</sect1>
|
|
|
|
<sect1 xml:id="features.gc.performance-considerations">
|
|
<title>Performance Considerations</title>
|
|
<para>
|
|
We have already mentioned in the previous section that simply collecting the
|
|
possible roots had a very tiny performance impact, but this is when you
|
|
compare PHP 5.2 against PHP 5.3. Although the recording of possible roots
|
|
compared to not recording them at all, like in PHP 5.2, is slower, other
|
|
changes to the PHP runtime in PHP 5.3 prevented this particular performance
|
|
loss from even showing.
|
|
</para>
|
|
<para>
|
|
There are two major areas in which performance is affected. The first
|
|
area is reduced memory usage, and the second area is run-time delay when
|
|
the garbage collection mechanism performs its memory cleanups. We will
|
|
look at both of those issues.
|
|
</para>
|
|
|
|
<sect2 xml:id="features.gc.performance-considerations.reduced-mem">
|
|
<title>Reduced Memory Usage</title>
|
|
<para>
|
|
First of all, the whole reason for implementing the garbage collection
|
|
mechanism is to reduce memory usage by cleaning up circular-referenced
|
|
variables as soon as the prerequisites are fulfilled. In PHP's
|
|
implementation, this happens as soon as the root-buffer is full, or when
|
|
the function <function>gc_collect_cycles</function> is called. In
|
|
the graph below, we display the memory usage of the script below,
|
|
in both PHP 5.2 and PHP 5.3, excluding the base memory that PHP
|
|
itself uses when starting up.
|
|
</para>
|
|
<para>
|
|
<example>
|
|
<title>Memory usage example</title>
|
|
<programlisting role="php">
|
|
<![CDATA[
|
|
<?php
|
|
class Foo
|
|
{
|
|
public $var = '3.14159265359';
|
|
}
|
|
|
|
$baseMemory = memory_get_usage();
|
|
|
|
for ( $i = 0; $i <= 100000; $i++ )
|
|
{
|
|
$a = new Foo;
|
|
$a->self = $a;
|
|
if ( $i % 500 === 0 )
|
|
{
|
|
echo sprintf( '%8d: ', $i ), memory_get_usage() - $baseMemory, "\n";
|
|
}
|
|
}
|
|
?>
|
|
]]>
|
|
</programlisting>
|
|
<mediaobject>
|
|
<alt>Comparison of memory usage between PHP 5.2 and PHP 5.3</alt>
|
|
<imageobject>
|
|
<imagedata fileref="en/features/figures/gc-benchmark.png" format="PNG"/>
|
|
</imageobject>
|
|
</mediaobject>
|
|
</example>
|
|
</para>
|
|
<para>
|
|
In this very academic example, we are creating an object in which a
|
|
property is set to point back to the object itself. When the <varname>$a</varname> variable
|
|
in the script is re-assigned in the next iteration of the loop, a memory
|
|
leak would typically occur. In this case, two zval-containers are leaked
|
|
(the object zval, and the property zval), but only one possible root is
|
|
found: the variable that was unset. When the root-buffer is full after
|
|
10,000 iterations (with a total of 10,000 possible roots), the garbage
|
|
collection mechanism kicks in and frees the memory associated with those
|
|
possible roots. This can very clearly be seen in the jagged memory-usage
|
|
graph for PHP 5.3. After each 10,000 iterations, the mechanism kicks in
|
|
and frees the memory associated with the circular referenced variables.
|
|
The mechanism itself does not have to do a whole lot of work in this
|
|
example, because the structure that is leaked is extremely simple. From
|
|
the diagram, you see that the maximum memory usage in PHP 5.3 is about 9
|
|
Mb, whereas in PHP 5.2 the memory usage keeps increasing.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2 xml:id="features.gc.performance-considerations.slowdowns">
|
|
<title>Run-Time Slowdowns</title>
|
|
<para>
|
|
The second area where the garbage collection mechanism influences
|
|
performance is the time taken when the garbage collection mechanism
|
|
kicks in to free the "leaked" memory. In order to see how much this is,
|
|
we slightly modify the previous script to allow for a larger number of
|
|
iterations and the removal of the intermediate memory usage figures. The
|
|
second script is here:
|
|
</para>
|
|
<para>
|
|
<example>
|
|
<title>GC performance influences</title>
|
|
<programlisting role="php">
|
|
<![CDATA[
|
|
<?php
|
|
class Foo
|
|
{
|
|
public $var = '3.14159265359';
|
|
}
|
|
|
|
for ( $i = 0; $i <= 1000000; $i++ )
|
|
{
|
|
$a = new Foo;
|
|
$a->self = $a;
|
|
}
|
|
|
|
echo memory_get_peak_usage(), "\n";
|
|
?>
|
|
]]>
|
|
</programlisting>
|
|
</example>
|
|
</para>
|
|
<para>
|
|
We will run this script two times, once with the
|
|
<link linkend="ini.zend.enable-gc">zend.enable_gc</link> setting turned
|
|
on, and once with it turned off:
|
|
</para>
|
|
<para>
|
|
<example>
|
|
<title>Running the above script</title>
|
|
<programlisting role="shell">
|
|
<![CDATA[
|
|
time php -dzend.enable_gc=0 -dmemory_limit=-1 -n example2.php
|
|
# and
|
|
time php -dzend.enable_gc=1 -dmemory_limit=-1 -n example2.php
|
|
]]>
|
|
</programlisting>
|
|
</example>
|
|
</para>
|
|
<para>
|
|
On my machine, the first command seems to take consistently about 10.7
|
|
seconds, whereas the second command takes about 11.4 seconds. This is a
|
|
slowdown of about 7%. However, the maximum amount of memory used by the
|
|
script is reduced by 98% from 931Mb to 10Mb. This benchmark is not very
|
|
scientific, or even representative of real-life applications, but it
|
|
does demonstrate the memory usage benefits that this garbage collection
|
|
mechanism provides. The good thing is that the slowdown is always the
|
|
same 7%, for this particular script, while the memory saving
|
|
capabilities save more and more memory as more circular references are
|
|
found during script execution.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2 xml:id="features.gc.performance-considerations.internal-stats">
|
|
<title>PHP's Internal GC Statistics</title>
|
|
<para>
|
|
It is possible to coax a little bit more information about how the
|
|
garbage collection mechanism is run from within PHP. But in order to do
|
|
so, you will have to re-compile PHP to enable the benchmark and
|
|
data-collecting code. You will have to set the <literal>CFLAGS</literal>
|
|
environment variable to <literal>-DGC_BENCH=1</literal> prior to running
|
|
<literal>./configure</literal> with your desired options. The following
|
|
sequence should do the trick:
|
|
</para>
|
|
<para>
|
|
<example>
|
|
<title>Recompiling PHP to enable GC benchmarking</title>
|
|
<programlisting role="shell">
|
|
<![CDATA[
|
|
export CFLAGS=-DGC_BENCH=1
|
|
./config.nice
|
|
make clean
|
|
make
|
|
]]>
|
|
</programlisting>
|
|
</example>
|
|
</para>
|
|
<para>
|
|
When you run the above example code again with the newly built PHP
|
|
binary, you will see the following being shown after PHP has finished
|
|
execution:
|
|
</para>
|
|
<para>
|
|
<example>
|
|
<title>GC statistics</title>
|
|
<programlisting role="shell">
|
|
<![CDATA[
|
|
GC Statistics
|
|
-------------
|
|
Runs: 110
|
|
Collected: 2072204
|
|
Root buffer length: 0
|
|
Root buffer peak: 10000
|
|
|
|
Possible Remove from Marked
|
|
Root Buffered buffer grey
|
|
-------- -------- ----------- ------
|
|
ZVAL 7175487 1491291 1241690 3611871
|
|
ZOBJ 28506264 1527980 677581 1025731
|
|
]]>
|
|
</programlisting>
|
|
</example>
|
|
</para>
|
|
<para>
|
|
The most informative statistics are displayed in the first block. You
|
|
can see here that the garbage collection mechanism ran 110 times, and in
|
|
total, more than 2 million memory allocations were freed during those
|
|
110 runs. As soon as the garbage collection mechanism has run at least
|
|
one time, the "Root buffer peak" is always 10000.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2 xml:id="features.gc.performance-considerations.conclusion">
|
|
<title>Conclusion</title>
|
|
<para>
|
|
In general the garbage collector in PHP will only cause a slowdown when the
|
|
cycle collecting algorithm actually runs, whereas in normal (smaller)
|
|
scripts there should be no performance hit at all.
|
|
</para>
|
|
<para>
|
|
However, in the cases where the cycle collection mechanism does run for
|
|
normal scripts, the memory reduction it will provide allows more of
|
|
those scripts to run concurrently on your server, since not so much
|
|
memory is used in total.
|
|
</para>
|
|
<para>
|
|
The benefits are most apparent for longer-running scripts, such as
|
|
lengthy test suites or daemon scripts. Also, for <link xlink:href="&url.php.gtk;">PHP-GTK</link> applications
|
|
that generally tend to run longer than scripts for the Web, the new
|
|
mechanism should make quite a bit of a difference regarding memory leaks
|
|
creeping in over time.
|
|
</para>
|
|
</sect2>
|
|
</sect1>
|
|
</chapter>
|
|
|
|
<!-- Keep this comment at the end of the file
|
|
vim600: syn=xml fen fdm=syntax fdl=2 si
|
|
vim: et tw=78 syn=sgml
|
|
vi: ts=1 sw=1
|
|
-->
|