<?xml version="1.0" encoding="utf-8"?>
<!-- $Revision$ -->
<chapter xml:id="mysqlnd-ms.concepts" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">
<title>Concepts</title>
<para>
The concept section explains the overall architecture and important concepts
of the plugin. The materials aim to help you understand the impact of
MySQL replication and of using the plugin on your development tasks.
Any application using MySQL replication must take care of certain tasks that
arise from using a database cluster.
</para>
<para>
It is strongly recommended to work through the materials in order to
be able to use the plugin with success. This is particularly true if you are
new to using MySQL replication.
</para>
<section xml:id="mysqlnd-ms.architecture">
<title>Architecture</title>
<para>
The mysqlnd replication and load balancing plugin is
implemented as a PHP extension.
It is written in C and operates under the hood of PHP. During the
startup of the PHP interpreter, in the module init phase of the
PHP engine, it gets registered as a
<link linkend="book.mysqlnd">mysqlnd</link> plugin to replace selected
mysqlnd C methods.
</para>
<para>
At PHP run time it inspects queries sent from
mysqlnd (PHP) to the MySQL server. If a query is recognized as read-only
it will be sent to one of the configured slave servers. A statement is
considered read-only if it starts with <literal>SELECT</literal>, if it
starts with the SQL hint <literal>/*ms=slave*/</literal>, or if a slave was
chosen for running the previous query and the statement starts with the SQL hint
<literal>/*ms=last_used*/</literal>. In all other cases the query will
be sent to the MySQL replication master server.
</para>
<para>
The plugin takes care internally of opening and closing the database connections
to the master server and the slave servers. From an application
point of view there continues to be only one connection handle. However,
internally, this one public connection handle represents a pool of
internal connections managed by the plugin. The plugin proxies queries
to the master server and the slave ones using multiple connections.
</para>
<para>
Database connections have a state consisting of, for example, transaction
status, transaction settings, character set settings and temporary tables.
The plugin will try to maintain the same state among all internal
connections, whenever this can be done in an automatic and transparent way.
In cases where it is not easily possible to maintain state among all
connections, such as when using <literal>START TRANSACTION</literal>, the
plugin leaves it to the user to handle. Further details are given below.
</para>
</section>
<section xml:id="mysqlnd-ms.pooling">
<title>Connection pooling and switching</title>
<para>
The replication and load balancing plugin changes the semantics of a PHP
MySQL connection handle. The existing APIs of the PHP MySQL extensions
(<link linkend="ref.mysqli">mysqli</link>,
<link linkend="ref.mysql">mysql</link>,
<link linkend="ref.pdo-mysql">PDO_MYSQL</link>) are not changed: no
functions are added or removed. However, their behaviour
changes when using the plugin. Existing applications do not need to
be adapted to a new API, but they may need to be modified because of
the behaviour changes.
</para>
<para>
The plugin breaks the one-to-one relationship between a
<link linkend="ref.mysqli">mysqli</link>,
<link linkend="ref.mysql">mysql</link>,
<link linkend="ref.pdo-mysql">PDO_MYSQL</link> connection
handle and a MySQL wire connection. When the plugin is used, a
<link linkend="ref.mysqli">mysqli</link>,
<link linkend="ref.mysql">mysql</link>,
<link linkend="ref.pdo-mysql">PDO_MYSQL</link> connection
handle represents a local pool of connections to the configured
MySQL replication master and the MySQL replication slave servers.
The plugin redirects queries to the master and slave servers.
At one point in time, a given PHP connection handle
may point to the MySQL master server. Later on, it may point
to one of the slave servers or still to the master. Manipulating
and replacing the wire connection referenced by a PHP MySQL
connection handle is not a transparent operation.
</para>
<para>
Every MySQL connection has a state. The state of the connections in
the connection pool of the plugin can differ. Whenever the
plugin switches from one wire connection to another, the current state of
the user connection may change. The applications must be aware of this.
</para>
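<para>
 The effect of a connection switch on session state can be modelled with a
 small, self-contained Python sketch. It is not plugin code: the classes
 <literal>WireConnection</literal> and <literal>PooledHandle</literal> and
 the variable name <literal>@myrole</literal> are hypothetical and only
 illustrate that a user variable set on one wire connection is unknown on
 another.
</para>
<informalexample>
<programlisting role="python"><![CDATA[
```python
class WireConnection:
    """Hypothetical stand-in for one MySQL wire connection."""

    def __init__(self, name):
        self.name = name
        self.user_variables = {}  # session state lives per wire connection


class PooledHandle:
    """Hypothetical model of one user handle backed by two connections."""

    def __init__(self):
        self.master = WireConnection("master_0")
        self.slave = WireConnection("slave_0")
        self.current = self.master

    def set_user_variable(self, name, value):
        # A SET statement is not read-only, so it runs on the master.
        self.current = self.master
        self.current.user_variables[name] = value

    def select_user_variable(self, name):
        # A SELECT is read-only, so the handle switches to the slave
        # right before the statement is run.
        self.current = self.slave
        return self.current.user_variables.get(name)  # None models SQL NULL


handle = PooledHandle()
handle.set_user_variable("@myrole", "admin")
print(handle.select_user_variable("@myrole"))  # prints None
```
]]></programlisting>
</informalexample>
<para>
 In the sketch the value is still present on the master connection. An
 application that depends on such state has to make sure that both
 statements run on the same server.
</para>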
<para>
The following list shows what the connection state consists of. The list
may not be complete.
</para>
<para>
<itemizedlist>
<listitem>
<simpara>
Transaction status
</simpara>
</listitem>
<listitem>
<simpara>
Temporary tables
</simpara>
</listitem>
<listitem>
<simpara>
Table locks
</simpara>
</listitem>
<listitem>
<simpara>
Session system variables and session user variables
</simpara>
</listitem>
<listitem>
<simpara>
Prepared statements
</simpara>
</listitem>
<listitem>
<simpara>
<literal>HANDLER</literal> variables
</simpara>
</listitem>
<listitem>
<simpara>
Locks acquired with <literal>GET_LOCK()</literal>
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
Connection switches happen right before queries are run. The plugin does
not switch the current connection until the moment in time when
the next statement is executed.
</para>
<para>
Please also read the chapter on replication features and issues in the
MySQL reference manual. Some restrictions you hit may not be related
to the PHP plugin but are properties of the MySQL replication system.
</para>
<para>Broadcasted messages</para>
<para>
The plugin's philosophy is to align the state of connections in the
pool only if the state is under full control of the plugin, or if it is
necessary for security reasons. Just a few actions that change the
state of the connection fall into this category.
</para>
<para>
The following client library calls change the connection state and are
broadcast to all open connections in the connection pool.
</para>
<informaltable>
<tgroup cols="3">
<colspec colwidth="10%"/>
<colspec colwidth="70%"/>
<colspec colwidth="20%"/>
<thead>
<row>
<entry>Library call</entry>
<entry>Notes</entry>
<entry>Version</entry>
</row>
</thead>
<tbody>
<row>
<entry>
<literal>change_user()</literal>
</entry>
<entry>
Called by the <function>mysqli_change_user</function> user API call.
Also triggered upon reuse of a persistent <literal>mysqli</literal>
connection.
</entry>
<entry>Since 1.0.0.</entry>
</row>
<row>
<entry>
<literal>select_db()</literal>
</entry>
<entry>
Called by the following user API calls:
<function>mysql_select_db</function>,
<function>mysql_list_tables</function>,
<function>mysql_db_query</function>,
<function>mysql_list_fields</function>,
<function>mysqli_select_db</function>.
</entry>
<entry>Since 1.0.0.</entry>
</row>
<row>
<entry>
<literal>set_charset()</literal>
</entry>
<entry>
Called by the following user API calls:
<function>mysql_set_charset</function>,
<function>mysqli_set_charset</function>.
</entry>
<entry>Since 1.0.0.</entry>
</row>
<row>
<entry>
<literal>set_server_option()</literal>
</entry>
<entry>
Called by the following user API calls:
<function>mysqli_multi_query</function>,
<function>mysqli_real_query</function>,
<function>mysqli_query</function>,
<function>mysql_query</function>.
</entry>
<entry>Since 1.0.0.</entry>
</row>
<row>
<entry>
<literal>set_client_option()</literal>
</entry>
<entry>
Called by the following user API calls:
<function>mysqli_options</function>,
<function>mysqli_ssl_set</function>,
<function>mysqli_connect</function>,
<function>mysql_connect</function>,
<function>mysql_pconnect</function>.
</entry>
<entry>Since 1.0.0.</entry>
</row>
<row>
<entry>
<literal>set_autocommit()</literal>
</entry>
<entry>
Called by the following user API calls:
<function>mysqli_autocommit</function>,
<function>PDO::setAttribute(PDO::ATTR_AUTOCOMMIT)</function>.
</entry>
<entry>Since 1.0.0. PHP &gt;= 5.4.0.</entry>
</row>
<row>
<entry>
<literal>ssl_set()</literal>
</entry>
<entry>
Called by the following user API calls:
<function>mysqli_ssl_set</function>.
</entry>
<entry>Since 1.1.0.</entry>
</row>
</tbody>
</tgroup>
</informaltable>
<para>
If any of the calls listed above is to be executed,
the plugin loops over all currently open master and slave connections.
The loop continues until all servers have been contacted. The loop does
not break if a server indicates a failure. If possible, the failure will be
propagated to the calling user API function. Depending on the user API
function that triggered the underlying library function, users may
be able to detect the failure.
</para>
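<para>
 The broadcast loop described above can be sketched as follows. This is a
 minimal Python model, not the C implementation of the plugin: the names
 <literal>broadcast</literal> and <literal>FakeConnection</literal> are
 hypothetical, and a failing server is simulated by raising
 <literal>ConnectionError</literal>.
</para>
<informalexample>
<programlisting role="python"><![CDATA[
```python
def broadcast(connections, call):
    """Apply `call` to every open connection; never stop early on failure."""
    all_ok = True
    for connection in connections:
        try:
            call(connection)
        except ConnectionError:
            # Remember the failure but keep contacting the remaining servers.
            all_ok = False
    return all_ok


class FakeConnection:
    """Hypothetical connection; healthy=False makes every call fail."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.charset = None

    def set_charset(self, charset):
        if not self.healthy:
            raise ConnectionError(self.name + " is unreachable")
        self.charset = charset


pool = [FakeConnection("master_0"),
        FakeConnection("slave_0", healthy=False),
        FakeConnection("slave_1")]
ok = broadcast(pool, lambda c: c.set_charset("utf8"))
print(ok, [c.charset for c in pool])  # prints False ['utf8', None, 'utf8']
```
]]></programlisting>
</informalexample>
<para>
 The return value stands for the failure that is propagated to the caller.
 Note that the connections reached after the failing one still receive the
 setting, so the pool may end up in a mixed state.
</para>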
<para>Broadcasting and lazy connections</para>
<para>
The plugin does not proxy or
<quote>remember</quote> all settings to apply them on connections
opened in the future. This is important to remember when
using
<link linkend="ini.mysqlnd-ms-plugin-config-v2.lazy_connections">lazy connections</link>.
Lazy connections are connections which are not
opened before the client sends the first query.
Use of lazy connections is the default plugin action.
</para>
<para>
The settings of the following connection state changing library calls are
recorded and applied when a lazy connection is opened, to ensure that the
connection state of all connections in the connection pool is comparable.
</para>
<informaltable>
<tgroup cols="3">
<colspec colwidth="10%"/>
<colspec colwidth="70%"/>
<colspec colwidth="20%"/>
<thead>
<row>
<entry>Library call</entry>
<entry>Notes</entry>
<entry>Version</entry>
</row>
</thead>
<tbody>
<row>
<entry>
<literal>change_user()</literal>
</entry>
<entry>
User, password and database recorded for future use.
</entry>
<entry>Since 1.1.0.</entry>
</row>
<row>
<entry>
<literal>select_db()</literal>
</entry>
<entry>
Database recorded for future use.
</entry>
<entry>Since 1.1.0.</entry>
</row>
<row>
<entry>
<literal>set_charset()</literal>
</entry>
<entry>
Calls <literal>set_client_option(MYSQL_SET_CHARSET_NAME, charset)</literal>
on lazy connection to ensure <literal>charset</literal> will be used
upon opening the lazy connection.
</entry>
<entry>Since 1.1.0.</entry>
</row>
<row>
<entry>
<literal>set_autocommit()</literal>
</entry>
<entry>
Adds <literal>SET AUTOCOMMIT=0|1</literal> to the list of init commands
of a lazy connection using
<literal>set_client_option(MYSQL_INIT_COMMAND, &quot;SET AUTOCOMMIT=...&quot;)</literal>.
</entry>
<entry>Since 1.1.0. PHP &gt;= 5.4.0.</entry>
</row>
</tbody>
</tgroup>
</informaltable>
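<para>
 How recorded settings are applied to a lazy connection can be illustrated
 with a short Python model. It is a sketch under assumptions, not the
 plugin's implementation: the class <literal>LazyConnection</literal> and
 its attributes are hypothetical, and the server is simulated by a list of
 executed statements.
</para>
<informalexample>
<programlisting role="python"><![CDATA[
```python
class LazyConnection:
    """Hypothetical lazy connection: settings are recorded until it opens."""

    def __init__(self, name):
        self.name = name
        self.is_open = False
        self.database = None
        self.init_commands = []
        self.executed = []  # statements sent to the simulated server

    def select_db(self, database):
        self.database = database  # recorded for future use

    def set_autocommit(self, enabled):
        # Recorded as an init command, run right after the connection opens.
        self.init_commands.append("SET AUTOCOMMIT=%d" % int(enabled))

    def query(self, statement):
        if not self.is_open:
            # The connection is opened only now, on first use.
            self.is_open = True
            self.executed.extend(self.init_commands)
        self.executed.append(statement)


slave = LazyConnection("slave_0")
slave.select_db("test")
slave.set_autocommit(False)
print(slave.is_open)  # prints False
slave.query("SELECT 1 FROM DUAL")
print(slave.is_open, slave.executed)
# prints True ['SET AUTOCOMMIT=0', 'SELECT 1 FROM DUAL']
```
]]></programlisting>
</informalexample>
<para>
 Only settings that pass through a recorded library call are replayed in
 this model. Anything else that changed the state of the other connections
 is not reproduced on the newly opened one.
</para>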
<para>
Please note that the connection state is not only changed by API calls. Thus,
even if PECL/mysqlnd_ms monitors all API calls, the application still needs
to take care. Ultimately, it is the application developer's responsibility
to maintain connection state, if needed.
</para>
</section>
<section xml:id="mysqlnd-ms.transaction">
<title>Transaction handling</title>
<para>
Transaction handling is fundamentally changed.
A SQL transaction is a unit of work run on one database server. The
unit of work consists of one or more SQL statements.
</para>
<para>
By default the plugin is not aware of SQL transactions. The plugin may
switch connections for load balancing at any point in time. Connection
switches may happen in the middle of a transaction. This is against the
nature of a SQL transaction. By default the plugin is not transaction safe.
</para>
<para>
At the time of writing, applications using SQL transactions together with
the plugin must use SQL hints to disable connection switches in the middle
of a SQL transaction. Details can be found in the examples section.
</para>
<para>
The latest version of the <literal>mysqlnd</literal> library, as found in
PHP 5.4.0, allows the plugin to subclass the library C API call
<literal>set_autocommit()</literal> to
detect the status of the <literal>autocommit</literal> mode. The PHP MySQL
extensions either issue a query such as <literal>SET AUTOCOMMIT=0|1</literal>
or use the mysqlnd library call <literal>set_autocommit()</literal> to control
the <literal>autocommit</literal> setting. If an extension makes use of
<literal>set_autocommit()</literal>, the plugin can be made transaction aware.
Transaction awareness cannot be achieved if SQL is used to set the autocommit
mode.
The library function <literal>set_autocommit()</literal> is called by the
<function>mysqli_autocommit</function> and
<function>PDO::setAttribute(PDO::ATTR_AUTOCOMMIT)</function> user API calls.
</para>
<para>
The experimental plugin configuration option
<literal>trx_stickiness=master</literal> can be used to make the plugin
transaction aware if using PHP 5.4.0 or newer. In this mode the plugin stops load
balancing if autocommit gets disabled and directs all statements to the
master until autocommit gets enabled.
</para>
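<para>
 The routing behaviour described for <literal>trx_stickiness=master</literal>
 can be modelled roughly as shown below. The Python class
 <literal>StickyRouter</literal> is hypothetical and greatly simplified: it
 only tracks the autocommit flag and treats statements starting with
 <literal>SELECT</literal> as read-only.
</para>
<informalexample>
<programlisting role="python"><![CDATA[
```python
class StickyRouter:
    """Hypothetical model of routing with trx_stickiness=master."""

    def __init__(self):
        self.autocommit = True

    def set_autocommit(self, enabled):
        # Stands for the monitored set_autocommit() library call.
        self.autocommit = enabled

    def pick(self, statement):
        if not self.autocommit:
            return "master"  # autocommit disabled: no load balancing
        if statement.lstrip().upper().startswith("SELECT"):
            return "slave"
        return "master"


router = StickyRouter()
before = router.pick("SELECT id FROM test")
router.set_autocommit(False)
during = router.pick("SELECT id FROM test")
router.set_autocommit(True)
after = router.pick("SELECT id FROM test")
print(before, during, after)  # prints slave master slave
```
]]></programlisting>
</informalexample>
<para>
 The model changes its decision only when <literal>set_autocommit()</literal>
 is called. A statement such as <literal>SET AUTOCOMMIT=0</literal> sent as
 plain SQL would go unnoticed, which mirrors the limitation described above.
</para>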
</section>
<section xml:id="mysqlnd-ms.failover">
<title>Failover</title>
<para>
Connection failover handling is left to the user. The application is responsible
for checking return values of the database functions it calls and reacting
to possible errors. If, for example, the plugin recognizes a query as a read-only
query to be sent to the slave servers and the slave server selected by the
plugin is not available, the plugin will not execute the statement and will
raise an error.
</para>
<para>
It is up to
the application to handle the error and, if need be, re-issue the query to
trigger selection of another slave server for statement execution. The plugin
will make no attempt to fail over automatically because the plugin
cannot ensure that an automatic failover will not change the state of
the connection. For example, the application may have issued a query
which depends on SQL user variables which are bound to a specific connection.
Such a query might return wrong results if the plugin switched the
connection implicitly as part of automatic failover. To ensure correct
results the application must take care of the failover and rebuild
the required connection state. Therefore, by default, no automatic failover
is done by the plugin.
</para>
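<para>
 Application-side failover handling could follow the pattern sketched below.
 The sketch is written in Python and does not use any plugin API: the
 exception <literal>ServerUnavailable</literal>, the helper
 <literal>run_with_retry</literal> and the callbacks passed to it are
 hypothetical placeholders for whatever the application uses to run a query
 and to rebuild its connection state.
</para>
<informalexample>
<programlisting role="python"><![CDATA[
```python
class ServerUnavailable(Exception):
    """Hypothetical error raised when the chosen server cannot be reached."""


def run_with_retry(run_query, statement, rebuild_state, max_attempts=3):
    """Re-issue `statement` after a failure, rebuilding session state first."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            if attempt > 0:
                rebuild_state()  # e.g. set SQL user variables again
            return run_query(statement)
        except ServerUnavailable as error:
            last_error = error
    raise last_error


calls = {"queries": 0, "rebuilds": 0}


def flaky_query(statement):
    # Simulates a server that is down for the first attempt only.
    calls["queries"] += 1
    if calls["queries"] == 1:
        raise ServerUnavailable("slave_0 is down")
    return "result of " + statement


def rebuild_state():
    calls["rebuilds"] += 1


result = run_with_retry(flaky_query, "SELECT 1 FROM DUAL", rebuild_state)
print(result)  # prints result of SELECT 1 FROM DUAL
print(calls)   # prints {'queries': 2, 'rebuilds': 1}
```
]]></programlisting>
</informalexample>
<para>
 The state is rebuilt before every retry because the re-issued query may be
 run on a different server than the failed one.
</para>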
<para>
A user who does not change the connection state after opening a connection
may activate automatic master failover.
</para>
<para>
The failover policy is configured in the plugin's configuration file using
the
<literal><link linkend="ini.mysqlnd-ms-plugin-config-v2.failover">failover</link></literal>
configuration directive.
</para>
</section>
<section xml:id="mysqlnd-ms.loadbalancing">
<title>Load balancing</title>
<para>
Four load balancing strategies are supported to distribute read-only
statements over the configured MySQL slave servers: random, random once,
round robin and user defined via callback. Random picks a random
server whenever a statement is to be executed. Random once picks a
random server once when the first statement is executed and uses
the decision for the rest of the PHP request. Random once is the default
if nothing else is configured, because random once has the lowest impact
on the connection state. Round robin iterates over the list of
configured servers. A user defined callback can be used to implement
any other strategy.
</para>
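<para>
 The three built-in policies can be compared with a short Python sketch. It
 models the selection logic only and is not taken from the plugin: the
 factory functions and the server names are hypothetical, and the random
 number generator is seeded only to make the sketch repeatable.
</para>
<informalexample>
<programlisting role="python"><![CDATA[
```python
import itertools
import random


def make_random(servers, rng):
    """Random: pick a new random server for every statement."""
    return lambda: rng.choice(servers)


def make_random_once(servers, rng):
    """Random once: pick on the first statement, then keep the choice."""
    chosen = []

    def pick():
        if not chosen:
            chosen.append(rng.choice(servers))
        return chosen[0]

    return pick


def make_round_robin(servers):
    """Round robin: iterate over the configured servers in order."""
    cycle = itertools.cycle(servers)
    return lambda: next(cycle)


slaves = ["slave_0", "slave_1", "slave_2"]
rng = random.Random(7)

pick_once = make_random_once(slaves, rng)
pick_rr = make_round_robin(slaves)
print(len({pick_once() for _ in range(5)}))  # prints 1
print([pick_rr() for _ in range(4)])
# prints ['slave_0', 'slave_1', 'slave_2', 'slave_0']
```
]]></programlisting>
</informalexample>
<para>
 Random once never changes servers within a request in this model, which is
 why it has the lowest impact on connection state. The other two policies
 may switch servers with every statement.
</para>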
<para>
The load balancing policy is configured in the plugin's configuration
file using the
<link linkend="ini.mysqlnd-ms-plugin-config-v2.filter_random">random</link>,
<link linkend="ini.mysqlnd-ms-plugin-config-v2.filter_roundrobin">roundrobin</link>
and <link linkend="ini.mysqlnd-ms-plugin-config-v2.filter_user">user</link>
filters. See below to learn more about
<link linkend="mysqlnd-ms.filter">filters</link>.
</para>
</section>
<section xml:id="mysqlnd-ms.rwsplit">
<title>Read-write splitting</title>
<para>
The plugin runs read-only statements on the configured MySQL slaves and
all other queries on the MySQL master. A statement is
considered read-only if it starts with <literal>SELECT</literal>, if it
starts with the SQL hint <literal>/*ms=slave*/</literal>, or if a slave was
chosen for running the previous query and the statement starts with the SQL hint
<literal>/*ms=last_used*/</literal>. In all other cases the query will
be sent to the MySQL replication master server.
</para>
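<para>
 The decision rules above can be written down as a small Python function.
 This is an illustrative model and not the plugin's parser: the function name
 <literal>pick_server</literal> is hypothetical, and ignoring leading
 whitespace and letter case when matching <literal>SELECT</literal> is an
 assumption of the sketch.
</para>
<informalexample>
<programlisting role="python"><![CDATA[
```python
def pick_server(statement, last_used="master"):
    """Return "slave" for statements treated as read-only, else "master"."""
    text = statement.lstrip()  # assumption: leading whitespace is ignored
    if text.upper().startswith("SELECT"):
        return "slave"
    if text.startswith("/*ms=slave*/"):
        return "slave"
    if text.startswith("/*ms=last_used*/") and last_used == "slave":
        return "slave"
    return "master"  # all other cases


print(pick_server("SELECT id FROM test"))                   # prints slave
print(pick_server("INSERT INTO test(id) VALUES (1)"))       # prints master
print(pick_server("/*ms=slave*/SHOW TABLES"))               # prints slave
print(pick_server("/*ms=last_used*/SHOW TABLES", "slave"))  # prints slave
print(pick_server("/*ms=last_used*/SHOW TABLES", "master")) # prints master
```
]]></programlisting>
</informalexample>
<para>
 Only the beginning of the statement string is inspected. A string that
 starts with <literal>SELECT</literal> is classified as read-only in this
 model regardless of what follows.
</para>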
<para>
SQL hints are a special kind of standard compliant SQL comments. The plugin
checks every statement for certain SQL hints. The SQL hints are described
together with the <link linkend="mysqlnd-ms.constants">constants</link>
exported by the extension. Other systems
involved in the statement processing, such as the MySQL server, SQL firewalls
or SQL proxies are unaffected by the SQL hints because those systems are
supposed to ignore SQL comments.
</para>
<para>
The built-in read-write splitter can be replaced by a user-defined one, see also
<function>mysqlnd_ms_set_user_pick_server</function>.
</para>
<para>
A user-defined read-write splitter can ask the built-in logic to make
a proposal where to send a statement by invoking
<function>mysqlnd_ms_is_select</function>.
</para>
<note>
<para>
The built-in read-write splitter is not aware of multi-statements.
Multi-statements are seen as one statement. The splitter will check the
beginning of the statement to decide where to run the statement. If, for example,
a multi-statement begins with
<literal>SELECT 1 FROM DUAL; INSERT INTO test(id) VALUES (1); ...</literal>
the plugin will run it on a slave although the statement is not read-only.
</para>
</note>
</section>
<section xml:id="mysqlnd-ms.filter">
<title>Filter</title>
<note>
<para>
The below description applies to PECL/mysqlnd_ms &gt;= 1.1.0-beta.
It is not valid for earlier versions.
</para>
</note>
<para>
PECL/mysqlnd_ms 1.1.0-beta introduces the concept of
<link linkend="mysqlnd-ms.plugin-ini-json">filters</link>.
Any PHP application using any kind of MySQL replication cluster first needs to identify
a group of servers in the cluster which could execute a given statement before
the statement is executed on one of the candidates. In other words: a given
list of servers has to be filtered until one server is left.
</para>
<para>
The process of filtering may include the use of one or more filters. Filters can be
chained. They are executed in the order of their appearance in the plugin's
configuration file. The concept of chained filters can be compared to using
pipes to connect command line utilities on an operating system command shell: an input stream
is passed to a processor, filtered and transferred to the output.
Then the output is passed as input to the next command which is connected
to the previous using the pipe operator.
</para>
<para>
The following filters are available with version 1.1.0-beta.
<itemizedlist>
<listitem>
<simpara>
Load balancing filter:
<link linkend="ini.mysqlnd-ms-plugin-config-v2.filters"><literal>random</literal></link>,
<link linkend="ini.mysqlnd-ms-plugin-config-v2.filters"><literal>roundrobin</literal></link>.
</simpara>
</listitem>
<listitem>
<simpara>
Selection filter:
<link linkend="ini.mysqlnd-ms-plugin-config-v2.filters"><literal>user</literal></link>.
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
The <literal>random</literal> and <literal>roundrobin</literal>
filters replace the
<link linkend="ini.mysqlnd-ms-plugin-config.pick"><literal>pick[]</literal></link>
configuration directive found in earlier versions. The <literal>random</literal>
filter implements the random and random once load balancing policies.
Round robin load balancing can be configured through the
<literal>roundrobin</literal> filter. Setting a user-defined callback for server
selection is possible with the <literal>user</literal> filter. The
<link linkend="function.mysqlnd-ms-set-user-pick-server">
<function>mysqlnd_ms_set_user_pick_server</function></link> function previously
used for this has been removed.
</para>
<para>
Filters can accept parameters to change their behaviour.
The <literal>random</literal> filter accepts an optional
<literal>sticky</literal> parameter. If set to true, the filter changes
load balancing from random to random once. Random picks a random server
every time a statement is to be executed. Random once picks a random
server when the first statement is to be executed and uses the same
server for the rest of the PHP request.
</para>
<para>
One of the biggest strengths of the filter concept is the possibility to
chain filters. This strength does not become immediately visible with the
filters provided by version 1.1.0-beta because all of the above filters
are supposed to output no more than one server. If a filter reduces
the list of candidates for running a statement to only one server, it
makes little sense to use that one server as input for another filter for
further reduction of the list of candidates.
</para>
<para>
A filter sequence of not much value:
<itemizedlist>
<listitem>
<simpara>
Statement to be executed: <literal>SELECT 1 FROM DUAL</literal>. Passed to all filters.
</simpara>
</listitem>
<listitem>
<simpara>
All configured nodes are passed as input to the first filter.
Master nodes: <literal>master_0</literal>.
Slave nodes: <literal>slave_0</literal>, <literal>slave_1</literal>.
</simpara>
</listitem>
<listitem>
<simpara>
Filter: <literal>random</literal>, argument <literal>sticky=1</literal>.
Picks a random slave once to be used for the rest of the PHP request.
Output: <literal>slave_0</literal>.
</simpara>
</listitem>
<listitem>
<simpara>
Output of <literal>slave_0</literal> and the statement to be executed
is passed as input to the next filter. Here: <literal>roundrobin</literal>,
server list passed to filter is: <literal>slave_0</literal>.
</simpara>
</listitem>
<listitem>
<simpara>
Filter: <literal>roundrobin</literal>. Server list consists of
one server only, round robin will always return the same server.
</simpara>
</listitem>
</itemizedlist>
If such a filter sequence is used,
the plugin may emit a warning like <literal>(mysqlnd_ms) Error while creating
filter '%s' . Non-multi filter '%s' already created. Stopping in %s on
line %d</literal>. Furthermore, an appropriate error may be set on the
connection handle.
</para>
<para>
In future versions there may be filters which return more than one candidate
for statement execution. For example, there may be a <literal>table</literal>
filter to support MySQL replication filtering. MySQL replication filters allow
you to define rules which database or table is to be replicated to which
node of a replication cluster. Assume your replication cluster
consists of four slaves (<literal>slave_0</literal>, <literal>slave_1</literal>,
<literal>slave_2</literal>, <literal>slave_3</literal>) two of which replicate a database named
<literal>sales</literal> (<literal>slave_0</literal>, <literal>slave_1</literal>).
If the application queries the database <literal>sales</literal>, the
hypothetical <literal>table</literal> filter reduces the list of possible
servers to <literal>slave_0</literal>, <literal>slave_1</literal>. Because
the output and list of candidates consists of more than one server, it is
necessary and possible to further filter the candidate list, for example, using
a load balancing filter to identify a server for statement execution.
</para>
<para>
A hypothetical filter sequence, assuming the existence of a <literal>table</literal>
filter to support MySQL replication filtering (client-side partitioning).
<itemizedlist>
<listitem>
<simpara>
Statement to be executed: <literal>SELECT col FROM sales.reports</literal>. Passed to all filters.
</simpara>
</listitem>
<listitem>
<simpara>
All configured nodes are passed as input to the first filter.
Master nodes: <literal>master_0</literal>.
Slave nodes: <literal>slave_0</literal>, <literal>slave_1</literal>,
<literal>slave_2</literal>, <literal>slave_3</literal>
</simpara>
</listitem>
<listitem>
<simpara>
Filter: <literal>table</literal>, rules set for database <literal>sales</literal>.
Output: <literal>slave_0</literal>, <literal>slave_1</literal>.
</simpara>
</listitem>
<listitem>
<simpara>
Output of <literal>slave_0</literal>, <literal>slave_1</literal>
and the statement to be executed
is passed as input to the next filter, which is <literal>roundrobin</literal>.
</simpara>
</listitem>
<listitem>
<simpara>
Filter: <literal>roundrobin</literal>. Server list consists of
two servers. Round robin selects <literal>slave_0</literal>.
Upon subsequent execution, if the same server list is given as
input, the filter will return <literal>slave_1</literal> followed
by <literal>slave_0</literal>, <literal>slave_1</literal>,
<literal>slave_0</literal> and so forth.
</simpara>
</listitem>
</itemizedlist>
<note>
<para>
The example aims to illustrate the strength of the filter
concept. It does not make any promises on future features.
</para>
</note>
</para>
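<para>
 The hypothetical sequence above can also be expressed as a small Python
 model of chained filters. Nothing in it is plugin API: the
 <literal>table</literal> filter does not exist, and the function names, the
 rule format and the way the database name is passed to the filters are
 assumptions made for the sketch.
</para>
<informalexample>
<programlisting role="python"><![CDATA[
```python
import itertools


def make_table_filter(rules):
    """Hypothetical filter: keep only servers that replicate `database`."""
    def table_filter(servers, database):
        allowed = rules.get(database, servers)
        return [s for s in servers if s in allowed]
    return table_filter


def make_roundrobin_filter():
    """Round robin over the candidate list left by the previous filter."""
    counter = itertools.count()

    def roundrobin_filter(servers, database):
        return [servers[next(counter) % len(servers)]]
    return roundrobin_filter


def run_chain(filters, servers, database):
    """Pass the candidate list through each filter in configured order."""
    for one_filter in filters:
        servers = one_filter(servers, database)
    return servers


slaves = ["slave_0", "slave_1", "slave_2", "slave_3"]
chain = [make_table_filter({"sales": ["slave_0", "slave_1"]}),
         make_roundrobin_filter()]
picks = [run_chain(chain, slaves, "sales")[0] for _ in range(4)]
print(picks)  # prints ['slave_0', 'slave_1', 'slave_0', 'slave_1']
```
]]></programlisting>
</informalexample>
<para>
 Each filter receives the output of the previous one, much like commands
 connected by a pipe. The round robin step is useful here only because the
 preceding filter leaves more than one candidate.
</para>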
</section>
</chapter>
<!-- Keep this comment at the end of the file
Local variables:
mode: sgml
sgml-omittag:t
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
indent-tabs-mode:nil
sgml-parent-document:nil
sgml-default-dtd-file:"~/.phpdoc/manual.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:nil
sgml-local-ecat-files:nil
End:
vim600: syn=xml fen fdm=syntax fdl=2 si
vim: et tw=78 syn=sgml
vi: ts=1 sw=1
-->