<?xml version="1.0" encoding="utf-8"?>
<!-- $Revision$ -->
<chapter xml:id="mongo.connecting" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">
<title>Connecting</title>
<para>
Connecting to MongoDB can be as easy as <literal>new MongoClient</literal>,
but there are many additional options and configurations. The documentation
for <function>MongoClient::__construct</function> covers all of the API
options, but this page gives some more details and advice for practical use
cases.
</para>
<section xml:id="mongo.connecting.ssl">
<title>Connecting over SSL</title>
<para>
The driver supports connecting to <link xlink:href="&url.mongodb.docs.configure-ssl;">MongoDB over SSL</link>
and can optionally use <link linkend="context.ssl">SSL Stream Context</link> options to provide more details,
such as verifying certificates against a specific certificate chain or authenticating to
<link xlink:href="&url.mongodb.docs.configure-x509;">MongoDB using X509 certificates</link>.
</para>
<example xml:id="mongo.connecting.context.ssl">
<title>Connect to MongoDB Instance with SSL Encryption</title>
<programlisting role="php">
<![CDATA[
<?php
$mc = new MongoClient("mongodb://server1", array("ssl" => true));
?>
]]>
</programlisting>
</example>
<example xml:id="mongo.connecting.context.ssl.verify">
<title>Connect to MongoDB Instance with SSL Encryption, verifying it is who we think it is</title>
<programlisting role="php">
<![CDATA[
<?php
$SSL_DIR = "/vagrant/certs/";
$SSL_FILE = "CA_Root_Certificate.pem";
$ctx = stream_context_create(array(
"ssl" => array(
/* Certificate Authority the remote server certificate must be signed by */
"cafile" => $SSL_DIR . "/" . $SSL_FILE,
/* Disable self signed certificates */
"allow_self_signed" => false,
/* Verify the peer certificate against our provided Certificate Authority root certificate */
"verify_peer" => true, /* Default to false pre PHP 5.6 */
/* Verify the peer name (e.g. hostname validation) */
/* Will use the hostname used to connect to the node */
"verify_peer_name" => true,
/* Verify the server certificate has not expired */
"verify_expiry" => true, /* Only available in the MongoDB PHP Driver */
),
));
$mc = new MongoClient(
"mongodb://server1",
array("ssl" => true),
array("context" => $ctx)
);
?>
]]>
</programlisting>
<note>
<para>
The <literal>"verify_peer_name"</literal> option is new in PHP 5.6.0. As of
version 1.6.5, however, the MongoDB driver backports this feature itself, so
it also works with PHP 5.3 and 5.4.
</para>
</note>
</example>
<example xml:id="mongo.connecting.context.ssl.certificate">
<title>Connect to MongoDB Instance that Requires Client Certificates</title>
<programlisting role="php">
<![CDATA[
<?php
$SSL_DIR = "/vagrant/certs/";
$SSL_FILE = "CA_Root_Certificate.pem";
$MYCERT = "/vagrant/certs/ca-signed-client.pem";
$ctx = stream_context_create(array(
"ssl" => array(
"local_cert" => $MYCERT,
/* If the certificate we are providing was passphrase encoded, we need to set it here */
"passphrase" => "My Passphrase for the local_cert",
/* Optionally verify the server is who it says it is */
"cafile" => $SSL_DIR . "/" . $SSL_FILE,
"allow_self_signed" => false,
"verify_peer" => true,
"verify_peer_name" => true,
"verify_expiry" => true,
),
));
$mc = new MongoClient(
"mongodb://server1/?ssl=true",
array(),
array("context" => $ctx)
);
?>
]]>
</programlisting>
</example>
<example xml:id="mongo.connecting.authenticate.ssl.x509">
<title>Authenticating with X.509 certificates</title>
<para>
The username is the <literal>certificate subject</literal> from the X509, which can be extracted like this:
</para>
<programlisting role="shell">
<![CDATA[
openssl x509 -in /vagrant/certs/ca-signed-client.pem -inform PEM -subject -nameopt RFC2253
]]>
</programlisting>
<programlisting role="php">
<![CDATA[
<?php
$ctx = stream_context_create( array(
"ssl" => array(
"local_cert" => "/vagrant/certs/ca-signed-client.pem",
)
) );
$mc = new MongoClient(
'mongodb://username@server1/?authSource=$external&authMechanism=MONGODB-X509&ssl=true',
array(),
array("context" => $ctx)
);
?>
]]>
</programlisting>
<para>
Where <literal>username</literal> is the certificate subject.
</para>
</example>
<simplesect role="changelog">
&reftitle.changelog;
<informaltable>
<tgroup cols="2">
<thead>
<row>
<entry>&Version;</entry>
<entry>&Description;</entry>
</row>
</thead>
<tbody>
<row>
<entry>1.5.0</entry>
<entry>
Added support for X509 authentication.
</entry>
</row>
<row>
<entry>1.4.0</entry>
<entry>
Added support for connecting to SSL enabled MongoDB.
</entry>
</row>
</tbody>
</tgroup>
</informaltable>
</simplesect>
</section>
<section xml:id="mongo.connecting.auth">
<title>Authentication</title>
<para>
If MongoDB is started with the <literal>--auth</literal> or
<literal>--keyFile</literal> options, you must authenticate before you can do
any operations with the driver. You may authenticate a connection by
specifying the username and password in either the connection URI or the
<literal>"username"</literal> and <literal>"password"</literal> options for
<function>MongoClient::__construct</function>.
</para>
<example xml:id="mongo.connecting.auth-example">
<title>Authenticating against the "admin" database</title>
<programlisting role="php">
<![CDATA[
<?php
// Specifying the username and password in the connection URI (preferred)
$m = new MongoClient("mongodb://${username}:${password}@localhost");
// Specifying the username and password via the options array (alternative)
$m = new MongoClient("mongodb://localhost", array("username" => $username, "password" => $password));
?>
]]>
</programlisting>
</example>
<para>
By default, the driver will authenticate against the <literal>admin</literal>
database. You may authenticate against a different database by specifying it
in either the connection URI or the <literal>"db"</literal> option for
<function>MongoClient::__construct</function>.
</para>
<example xml:id="mongo.connecting.auth-db-example">
<title>Authenticating against normal databases</title>
<programlisting role="php">
<![CDATA[
<?php
// Specifying the authentication database in the connection URI (preferred)
$m = new MongoClient("mongodb://${username}:${password}@localhost/myDatabase");
// Specifying the authentication database via the options array (alternative)
$m = new MongoClient("mongodb://${username}:${password}@localhost", array("db" => "myDatabase"));
?>
]]>
</programlisting>
</example>
<para>
If your connection is dropped, the driver will automatically attempt to
reconnect and reauthenticate you.
</para>
</section>
<section xml:id="mongo.connecting.rs">
<title>Replica Sets</title>
<para>
To connect to a replica set, specify one or more members of the set and use
the <literal>"replicaSet"</literal> option. Multiple servers may be delimited
by a comma.
</para>
<example xml:id="mongo.connecting.rs-example">
<title>ReplicaSet seed list</title>
<programlisting role="php">
<![CDATA[
<?php
// Using multiple servers as the seed list (preferred)
$m = new MongoClient("mongodb://rs1.example.com:27017,rs2.example.com:27017/?replicaSet=myReplSetName");
// Using one server as the seed list
$m = new MongoClient("mongodb://rs1.example.com:27017", array("replicaSet" => "myReplSetName"));
// Using multiple servers as the seed list
$m = new MongoClient("mongodb://rs1.example.com:27017,rs2.example.com:27017", array("replicaSet" => "myReplSetName"));
?>
]]>
</programlisting>
</example>
<para>
The PHP driver will query the database server(s) listed to determine the
primary. So long as it can connect to at least one host listed and find a
primary, the connection will succeed. If it cannot make a connection to any
servers listed or cannot find a primary, a
<classname>MongoConnectionException</classname> will be thrown.
</para>
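<para>
The failure case described above can be handled with a
<literal>try</literal>/<literal>catch</literal> block. The following is a
minimal sketch only: the host names and replica set name are the placeholders
from the earlier example, and logging via <function>error_log</function> is
just one possible reaction.
</para>
<example xml:id="mongo.connecting.rs-exception-example">
<title>Handling a failed replica set connection (illustrative sketch)</title>
<programlisting role="php">
<![CDATA[
<?php
try {
    $m = new MongoClient(
        "mongodb://rs1.example.com:27017,rs2.example.com:27017",
        array("replicaSet" => "myReplSetName")
    );
} catch (MongoConnectionException $e) {
    // No seed could be reached, or no primary could be found
    error_log("Could not connect to replica set: " . $e->getMessage());
    $m = null;
}
?>
]]>
</programlisting>
</example>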
<tip>
<para>
You should always provide a seed list with more than one member of the
ReplicaSet. For highest availability you should seed with at least one
server from each of your datacenters.
</para>
</tip>
<warning>
<para>
The host names that you specify in the seed list <emphasis>must</emphasis>
match the host names in the replica set configuration. This is because the
driver only uses the host names in the replica set configuration to create
the hash for its persistent connections.
</para>
<para>
For example, if an IP address is used in the seed list and the replica set
is configured with host names, the driver will discard any seed list
connection(s) that differ from the canonical host names reported by the
replica set. Effectively, these non-canonical seed list connections will be
recreated on each request, greatly reducing the benefit of using persistent
connections.
</para>
</warning>
<warning>
<para>
The driver does <emphasis>not</emphasis> support connecting to different
replica sets with the same name. This limitation extends beyond a single
script, so make sure each of your replica sets has a distinct name. It also
means that you can <emphasis>not</emphasis> do this:
</para>
<example xml:id="mongo.connecting.rs-example-wrong-replset">
<title>Wrong replica set name duplication</title>
<programlisting role="php">
<![CDATA[
<?php
$m = new MongoClient("mongodb://devserver1,devserver2,devserver3/?replicaSet=repset");
$m = new MongoClient("mongodb://prodserver1,prodserver2,prodserver3/?replicaSet=repset");
?>
]]>
</programlisting>
</example>
<para>
Instead, you need to have two different names for your replica sets:
</para>
<example xml:id="mongo.connecting.rs-example-correct-replset">
<title>Correct use of two replica set names</title>
<programlisting role="php">
<![CDATA[
<?php
$m = new MongoClient("mongodb://devserver1,devserver2,devserver3/?replicaSet=devset");
$m = new MongoClient("mongodb://prodserver1,prodserver2,prodserver3/?replicaSet=prodset");
?>
]]>
</programlisting>
</example>
</warning>
<para>
If the primary becomes unavailable, an election will take place and a
secondary will be promoted to the role of primary (unless a majority vote
cannot be established). During this time
(<link xlink:href="&url.mongodb.replica.failover;">20-60 seconds</link>), the
connection will not be able to perform any write operations and attempts to
do so will result in an exception. Connections to secondaries will still be
able to perform reads.
</para>
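<para>
Because writes fail with an exception while an election is in progress, an
application may choose to retry them for a limited time. The sketch below
shows one possible approach and is not a driver feature:
<literal>retryDuringFailover</literal> is a hypothetical helper, and the
attempt count and delay in the usage comment are arbitrary values that should
be tuned to the expected failover time.
</para>
<example xml:id="mongo.connecting.rs-retry-example">
<title>Retrying an operation during a failover (illustrative sketch)</title>
<programlisting role="php">
<![CDATA[
<?php
/**
 * Hypothetical helper (not part of the driver): calls $operation until it
 * succeeds or $maxAttempts is reached, sleeping $delaySeconds between tries.
 * The last exception is rethrown if every attempt fails.
 */
function retryDuringFailover($operation, $maxAttempts, $delaySeconds)
{
    if ($maxAttempts < 1) {
        throw new InvalidArgumentException("maxAttempts must be at least 1");
    }
    $lastError = null;
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            return call_user_func($operation);
        } catch (Exception $e) {
            // Deliberately broad for the sketch; narrow this in real code
            $lastError = $e;
            if ($attempt < $maxAttempts) {
                sleep($delaySeconds);
            }
        }
    }
    throw $lastError;
}

// Usage (assumes $collection is a MongoCollection and $doc an array):
// $result = retryDuringFailover(function () use ($collection, $doc) {
//     return $collection->insert($doc);
// }, 12, 5);
?>
]]>
</programlisting>
</example>
<para>
Retrying a write is only safe when the operation can be repeated without
side effects, for example an insert with a fixed <literal>_id</literal>.
</para>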
<note>
<para>
The default <link linkend="mongo.readpreferences">Read Preference</link>
is to only read from the primary. During the election process there is no
primary, and all reads will therefore fail.
</para>
<para>
It is recommended to use
<constant>MongoClient::RP_PRIMARY_PREFERRED</constant> Read Preference for
applications that require high availability for reads, as reads will only
be executed on the secondaries when there simply isn't a primary
available.
</para>
</note>
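<para>
As a brief illustration of the recommendation above, the read preference can
be set either in the connection URI or on an existing
<classname>MongoClient</classname> instance. The host names and replica set
name are placeholders reused from the earlier examples.
</para>
<example xml:id="mongo.connecting.rs-readpref-example">
<title>Selecting the primaryPreferred Read Preference (illustrative sketch)</title>
<programlisting role="php">
<![CDATA[
<?php
// Set the read preference in the connection URI...
$m = new MongoClient(
    "mongodb://rs1.example.com:27017,rs2.example.com:27017/"
    . "?replicaSet=myReplSetName&readPreference=primaryPreferred"
);
// ...or on an existing MongoClient instance
$m->setReadPreference(MongoClient::RP_PRIMARY_PREFERRED);
?>
]]>
</programlisting>
</example>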
<para>
Once a primary is elected, attempting to perform a read or write will allow
the driver to detect the new primary. The driver will make this its primary
database connection and continue operating normally.
</para>
<para>
The health and state of a secondary is checked every 5 seconds
(configurable with
<link linkend="ini.mongo.ping-interval">mongo.ping_interval</link>)
or when the next operation occurs after 5 seconds. It will also recheck
the configuration when the driver has a problem reaching a server.
</para>
<para>
Replica set failovers are checked every 60 seconds (configurable with
<link linkend="ini.mongo.is-master-interval">mongo.is_master_interval</link>),
and whenever a write operation fails when using acknowledged writes.
</para>
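<para>
The two intervals can be inspected at runtime with
<function>ini_get</function>. The sketch below also changes one of them with
<function>ini_set</function>; this assumes the setting is changeable at
runtime, and the value of 30 seconds is an arbitrary example. Setting the
values in <filename>php.ini</filename> is the more conservative option.
</para>
<example xml:id="mongo.connecting.rs-intervals-example">
<title>Inspecting the health check intervals (illustrative sketch)</title>
<programlisting role="php">
<![CDATA[
<?php
// Current intervals, in seconds
echo "ping_interval: ", ini_get("mongo.ping_interval"), "\n";
echo "is_master_interval: ", ini_get("mongo.is_master_interval"), "\n";

// Assumption: the setting may be changed at runtime
ini_set("mongo.is_master_interval", "30");
?>
]]>
</programlisting>
</example>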
<caution>
<para>
Secondaries may be behind the primary in operations, so
your application must be able to handle out-of-date data when using
Read Preferences other than <constant>MongoClient::RP_PRIMARY</constant>.
</para>
</caution>
<para>
For more information on replica sets, see the
<link xlink:href="&url.mongodb.replica;">core documentation</link>.
</para>
<simplesect role="seealso">
&reftitle.seealso;
<simplelist>
<member><xref linkend="mongo.readpreferences" /></member>
<member><xref linkend="mongo.writeconcerns" /></member>
</simplelist>
</simplesect>
<simplesect role="changelog">
&reftitle.changelog;
<informaltable>
<tgroup cols="2">
<thead>
<row>
<entry>&Version;</entry>
<entry>&Description;</entry>
</row>
</thead>
<tbody>
<row>
<entry>1.0.9</entry>
<entry>
Added support for connecting to ReplicaSet and automatic failover.
</entry>
</row>
</tbody>
</tgroup>
</informaltable>
</simplesect>
</section>
<section xml:id="mongo.connecting.mongos">
<title>Sharding</title>
<para>
To connect to a shard cluster, specify the address of one or more
<literal>mongos</literal> instances in the connection string. Multiple
servers may be delimited by a comma.
</para>
<example xml:id="mongo.connecting.mongos-example">
<programlisting role="php">
<![CDATA[
<?php
// Using one server as the seed list
$m = new MongoClient("mongodb://mongos1.example.com:27017");
// Using multiple servers as the seed list
$m = new MongoClient("mongodb://mongos1.example.com:27017,mongos2.example.com:27017");
?>
]]>
</programlisting>
</example>
<para>
Regardless of whether each shard is a stand-alone <literal>mongod</literal>
server or a full replica set, the driver's connection process is the same.
All database communication will be routed through <literal>mongos</literal>.
</para>
<para>
For more information on sharding with MongoDB, see the
<link xlink:href="&url.mongodb.docs.sharding;">sharding documentation</link>.
</para>
</section>
<section xml:id="mongo.connecting.uds">
<title>Domain Socket Support</title>
<para>
MongoDB has built-in support for connections via Unix Domain Sockets and will
open the socket on startup, by default located at
<filename>/tmp/mongodb-&lt;port&gt;.sock</filename>.
</para>
<para>
To connect to the socket file, specify the path in your MongoDB connection
string:
</para>
<example xml:id="mongo.connecting.uds-example">
<programlisting role="php">
<![CDATA[
<?php
$m = new MongoClient("mongodb:///tmp/mongodb-27017.sock");
?>
]]>
</programlisting>
</example>
<para>
If you would like to authenticate against a database (as described above)
with a socket file, you must specify a port of <literal>0</literal> so that
the connection string parser can detect the end of the socket path.
Alternatively, you can use the constructor options.
</para>
<example xml:id="mongo.connecting.uds-auth-example">
<programlisting role="php">
<![CDATA[
<?php
$m = new MongoClient("mongodb://username:password@/tmp/mongodb-27017.sock:0/foo");
?>
]]>
</programlisting>
</example>
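<para>
The constructor-options alternative mentioned above might look like the
following sketch. It uses the <literal>"username"</literal>,
<literal>"password"</literal> and <literal>"db"</literal> options described
in the Authentication section; the credentials and database name are
placeholders.
</para>
<example xml:id="mongo.connecting.uds-options-example">
<programlisting role="php">
<![CDATA[
<?php
// Credentials and database are passed as options, so no ":0" port is needed
$m = new MongoClient(
    "mongodb:///tmp/mongodb-27017.sock",
    array("username" => "username", "password" => "password", "db" => "foo")
);
?>
]]>
</programlisting>
</example>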
<simplesect role="changelog">
&reftitle.changelog;
<informaltable>
<tgroup cols="2">
<thead>
<row>
<entry>&Version;</entry>
<entry>&Description;</entry>
</row>
</thead>
<tbody>
<row>
<entry>1.0.9</entry>
<entry>
Added support for Unix Domain Sockets.
</entry>
</row>
</tbody>
</tgroup>
</informaltable>
</simplesect>
</section>
<section xml:id="mongo.connecting.persistent">
<title>Persistent Connections (version 1.3.0+)</title>
<para>
All versions of the driver since 1.3.0 utilize persistent connections to
minimize the number of connections made to each database server. These
connections are saved by the PHP worker process and may be reused between
multiple requests.
</para>
<para>
Before connecting to a database server, the driver will create a hash for the
connection based on its host, port, replica set name (if any), any
authentication credentials (e.g. username, password, database), and the
process ID. If a connection already exists for that hash, it will be used in
lieu of creating a new connection associated with that hash.
<function>MongoClient::getConnections</function> may be used to retrieve info
about each persistent connection. Consider the following program:
</para>
<example xml:id="mongo.connecting.persistent-example">
<programlisting role="php">
<![CDATA[
<?php
$m1 = new MongoClient('mongodb://localhost');
$m2 = new MongoClient('mongodb://localhost');
$m3 = new MongoClient('mongodb://user:pw@localhost');
$m4 = new MongoClient('mongodb://127.0.0.1');
$m5 = new MongoClient('mongodb://rs1.local:30017,rs2.local:30018/?replicaSet=rs');
$m6 = new MongoClient('mongodb://sharding.local:40017');
foreach (MongoClient::getConnections() as $conn) {
echo $conn['hash'], "\n";
}
?>
]]>
</programlisting>
&example.outputs.similar;
<screen>
<![CDATA[
localhost:27017;-;X;15487
localhost:27017;-;admin/user/c56c…8bbc;15487
127.0.0.1:27017;-;X;15487
rs1.local:30017;rs;X;15487
rs2.local:30018;rs;X;15487
sharding.local:40017;-;X;15487
]]>
</screen>
</example>
<para>
In this example <literal>$m1</literal> and <literal>$m2</literal> have the
same hash and share a persistent connection. Connections for each other
MongoClient instance hash to unique values and use their own sockets. Note
that "localhost" and "127.0.0.1" do not share the same hash; DNS resolution
is not taken into account.
</para>
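<para>
For diagnostics, the hash strings shown above can be split into their
components. The following sketch assumes the four semicolon-separated fields
visible in the sample output (server, replica set name, credentials, process
ID). That layout is inferred from the output only and is not a documented
format, and <literal>describeConnectionHash</literal> is a hypothetical
helper.
</para>
<example xml:id="mongo.connecting.persistent-hash-example">
<title>Splitting a connection hash into its parts (illustrative sketch)</title>
<programlisting role="php">
<![CDATA[
<?php
/**
 * Hypothetical helper: splits a hash such as "rs1.local:30017;rs;X;15487".
 * Returns null when the string does not have exactly four fields.
 */
function describeConnectionHash($hash)
{
    $parts = explode(";", $hash);
    if (count($parts) !== 4) {
        return null;
    }
    list($server, $replicaSet, $auth, $pid) = $parts;
    return array(
        "server"        => $server,
        // "-" marks a connection that does not belong to a replica set
        "replicaSet"    => $replicaSet === "-" ? null : $replicaSet,
        // "X" marks a connection without credentials
        "authenticated" => $auth !== "X",
        "pid"           => (int) $pid,
    );
}

// Usage:
// foreach (MongoClient::getConnections() as $conn) {
//     var_dump(describeConnectionHash($conn['hash']));
// }
?>
]]>
</programlisting>
</example>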
</section>
<section xml:id="mongo.connecting.pools">
<title>Connection Pooling (version 1.2.0-1.2.12 *only*)</title>
<note>
<para>
This section is no longer relevant as of the 1.3.0 release of the driver
and serves only as historical information on how pooling used to work.
</para>
<para>
The latest versions of the driver have no concept of pooling anymore and
will maintain only one connection per process, for each connection type
(ReplicaSet/standalone/mongos), for each credentials combination.
</para>
</note>
<para>
Creating connections is one of the most heavyweight things that the driver
does. It can take hundreds of milliseconds to set up a connection correctly,
even on a fast network. Thus, the driver tries to minimize the number of new
connections created by reusing connections from a pool.
</para>
<para>
When a user creates a new instance of <classname>MongoClient</classname>, all
necessary connections will be taken from their pools (replica sets may
require multiple connections, one for each member of the set). When the
<classname>MongoClient</classname> instance goes out of scope, the
connections will be returned to the pool. When the PHP process exits, all
connections in the pools will be closed.
</para>
<section>
<title>"Why do I have so many open connections?"</title>
<para>
Connection pools can generate a large number of connections. This is
expected and, using a little arithmetic, you can figure out how many
connections will be created. There are three factors in determining the
total number of connections:
</para>
<itemizedlist>
<listitem>
<para>
<literal>
connections_per_pool
</literal>
</para>
<para>
Each connection pool will create, by default, an unlimited number of
connections. One might assume that this is a problem: if it can create an
unlimited number of connections, couldn't it create thousands and the
server would run out of file descriptors? In practice, this is unlikely,
as unused connections are returned to the pool to be used later, so future
connections will use the same connection instead of creating a new one.
Unless you create thousands of connections at once without letting any go
out of scope, the number of connections open should stay at a reasonable
number.
</para>
<para>
You can see how many connections you have in a pool using the
<function>MongoPool::info</function> function. Add up the "in use" and
"in pool" fields for a given server. That is the total number of
connections for that pool.
</para>
</listitem>
<listitem>
<para>
<literal>
pools_per_process
</literal>
</para>
<para>
Each MongoDB server address you're connecting to gets its own connection
pool. For example, if your local hostname is "example.net", connecting
to "example.net:27017", "localhost:27017", and "/tmp/mongodb-27017.sock"
will create three connection pools. You can see how many connection pools
you have open using <function>MongoPool::info</function>.
</para>
</listitem>
<listitem>
<para>
<literal>
processes
</literal>
</para>
<para>
Each PHP process has a separate set of pools. PHP-FPM and Apache
generally create between 6 and a couple of dozen PHP worker children.
Check your settings to see what the max number of PHP processes is that
can be spawned.
</para>
<para>
If you are using PHP-FPM, estimating the number of connections can be
tricky because it will spawn more PHP-FPM workers under heavy load. To be
on the safe side, look at the <literal>max_children</literal> parameter or
add up <literal>spare_servers</literal> + <literal>start_servers</literal>
(choose whichever number is higher). That will indicate how many PHP
processes (i.e. sets of pools) you should plan for.
</para>
</listitem>
</itemizedlist>
<para>
The three variables above can be multiplied together to give the max
number of connections expected:
<literal>connections_per_pool</literal> *
<literal>pools_per_process</literal> *
<literal>processes</literal>. Note that
<literal>connections_per_pool</literal> can be different for different
pools, so <literal>connections_per_pool</literal> should be the max.
</para>
<para>
For example, suppose we're getting 30 connections per pool, 10 pools per PHP
process, and 128 PHP processes. Then we can expect 38400 connections from
this machine. Thus, we should set this machine's file descriptor limit to
be high enough to handle all of these connections or it may run out of file
descriptors.
</para>
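<para>
The same arithmetic can be written as a small function.
<literal>estimateMaxConnections</literal> is a hypothetical helper used only
to illustrate the formula above.
</para>
<example xml:id="mongo.connecting.pools-estimate-example">
<title>Estimating the maximum number of connections (illustrative sketch)</title>
<programlisting role="php">
<![CDATA[
<?php
/**
 * Hypothetical helper: upper bound on the connections opened by one machine.
 * $connectionsPerPool should be the largest value observed for any pool.
 */
function estimateMaxConnections($connectionsPerPool, $poolsPerProcess, $processes)
{
    return $connectionsPerPool * $poolsPerProcess * $processes;
}

// 30 connections per pool, 10 pools per process, 128 PHP processes
echo estimateMaxConnections(30, 10, 128), "\n"; // 38400
?>
]]>
</programlisting>
</example>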
<para>
See <classname>MongoPool</classname> for more information on connection
pooling.
</para>
</section>
</section>
<section xml:id="mongo.connecting.persistent.manual">
<title>Manually Persistent Connections (version up to 1.1.4 *only*)</title>
<note>
<para>
This section is not relevant for 1.2.0+. In 1.2.0+, connections are always
persistent and managed automatically by the driver. If you are using a
1.2.x release (but not 1.3.x or later), see
<classname>MongoPool</classname> for more information on pooling.
</para>
</note>
<para>
Creating a new connection to the database is very slow. To minimize the number
of connections that you need to make, you can use persistent connections. A
persistent connection is saved by PHP, so you can use the same connection for
multiple requests.
</para>
<para>
For example, this simple program connects to the database 1000 times:
</para>
<example xml:id="mongo.connecting.no-manual-persist-example">
<programlisting role="php">
<![CDATA[
<?php
for ($i=0; $i<1000; $i++) {
$m = new MongoClient();
}
?>
]]>
</programlisting>
</example>
<para>
It takes approximately 18 seconds to execute. If we change it to use a
persistent connection:
</para>
<example xml:id="mongo.connecting.manual-persist-example">
<programlisting role="php">
<![CDATA[
<?php
for ($i=0; $i<1000; $i++) {
$m = new MongoClient("localhost:27017", array("persist" => "x"));
}
?>
]]>
</programlisting>
</example>
<para>
...it takes less than .02 seconds to execute, as it only makes one database
connection.
</para>
<para>
Persistent connections need an identifier string (which is "x" in the above
example) to uniquely identify them. For a persistent connection to be used,
the hostname, port, persist string, and authentication credentials (username,
password and database, if given) must match an existing persistent
connection. Otherwise, a new connection will be created with this identifying
information.
</para>
<para>
Persistent connections are <emphasis>highly recommended</emphasis> and should
always be used in production unless there is a compelling reason not to.
Most of the reasons that they are not recommended for relational databases
are irrelevant to MongoDB.
</para>
</section>
</chapter>
<!-- Keep this comment at the end of the file
Local variables:
mode: sgml
sgml-omittag:t
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
indent-tabs-mode:nil
sgml-parent-document:nil
sgml-default-dtd-file:"~/.phpdoc/manual.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:nil
sgml-local-ecat-files:nil
End:
vim600: syn=xml fen fdm=syntax fdl=2 si
vim: et tw=78 syn=sgml
vi: ts=1 sw=1
-->