Driver Architecture and Internals
Explains the driver architecture and special features
Architecture
Architecture Overview
This section explains how the different parts of the driver fit
together, from the different language runtimes, through the extension, to
the PHP libraries on top. This new architecture has replaced the old
mongo extension. We refer to the new one
as the mongodb extension.
Architecture Diagram
At the top of this stack sits a pure
PHP library, which we will
distribute as a Composer package. This library will provide an API similar
to what users have come to expect from the old mongo driver (e.g. CRUD methods,
database and collection objects, command helpers) and we expect it to be a
common dependency for most applications built with MongoDB. This library
will also implement common
specifications, in the
interest of improving API consistency across all of the
drivers maintained by
MongoDB (and hopefully some community drivers, too).
Sitting below that library we have the lower level driver.
This extension will effectively form the glue between PHP and our
system libraries (libmongoc and
libbson). This extension
will expose an identical public API for the most essential and
performance-sensitive functionality:
Connection management
BSON encoding and decoding
Object document serialization (to support ODM libraries)
Executing commands and write operations
Handling queries and cursors
By decoupling the driver internals and a high-level API into an extension and
PHP libraries, respectively, we hope to reduce our maintenance burden and
allow for faster iteration on new features. As a welcome side effect, this
also makes it easier for anyone to contribute to the driver. Additionally,
an identical public API will make it that much easier to port an
application across PHP runtimes, whether the application uses the low-level
driver directly or a higher-level PHP library.
GridFS is a great example
of why we chose this direction.
Although we implemented GridFS in C for our old mongo driver, it is actually
quite a high-level specification. Its API is just an abstraction for
accessing two collections: files (i.e. metadata) and chunks (i.e. blocks of
data). Likewise, all of the syntactic sugar found in the old mongo driver,
such as processing uploaded files or exposing GridFS files as PHP streams,
can be implemented in pure PHP. Provided we have performant methods for
reading from and writing to GridFS' collections – and thanks to our low
level extensions, we will – shifting this API to PHP is win-win.
Earlier I mentioned that we expect the PHP library to be a common
dependency for most applications, but not
all. Some users may prefer to stick to the no-frills
API offered by the extensions, or create their own high-level abstraction
(akin to Doctrine MongoDB for
the old mongo driver). Future libraries could include a PHP library geared
for MongoDB administration, which provides an API for various user
management and ops commands. The next major version of
Doctrine MongoDB ODM will
likely also sit directly atop the extensions.
While we will continue to maintain and support the old mongo driver and its
users for the foreseeable future, we invite everyone to use the
next-generation driver and consider it for any new projects going forward.
You can find all of the essential components across GitHub and JIRA:
Driver Source Code and JIRA Locations
Project | GitHub | JIRA
PHP Library | mongodb/mongo-php-library | PHPLIB
PHP 5 and PHP 7 Driver (phongo) | mongodb/mongo-php-driver | PHPC
The existing PHP project in JIRA
will remain open for reporting bugs against the old mongo driver, but we
would ask that you use the new projects above for anything pertaining to
our next-generation drivers.
Connections
Connection handling and persistence
Connection and topology persistence (driver versions since 1.2.0)
All versions of the driver since 1.2.0 persist the
libmongoc client object in
the PHP worker process, which allows it to re-use database connections,
authentication states, and topology information across
multiple requests.
When MongoDB\Driver\Manager::__construct is
invoked, a hash is created from its arguments (i.e. URI string and array
options). The driver will attempt to find a previously persisted
libmongoc client object for
that hash. If an existing client cannot be found for the hash, a new client
will be created (and persisted for future use).
Each client contains its own database connections and a view of the server
topology (e.g. standalone, replica set, sharded cluster). By persisting the
client between PHP requests, the driver is able to re-use established
database connections and remove the need for
discovering the server topology
on each request.
Consider the following example:
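<?php

$managers = [
    new MongoDB\Driver\Manager('mongodb://127.0.0.1'),
    new MongoDB\Driver\Manager('mongodb://127.0.0.1'),
    new MongoDB\Driver\Manager('mongodb://127.0.0.1:27017'),
    new MongoDB\Driver\Manager('mongodb://rs1.example.com,rs2.example.com/', ['replicaSet' => 'myReplicaSet']),
];

// Issue a command through each Manager so that it actually connects
foreach ($managers as $manager) {
    $manager->executeCommand('test', new MongoDB\Driver\Command(['ping' => 1]));
}

?>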
The first two Manager objects will share the same
libmongoc client since
their constructor arguments are identical. The third and fourth objects will
each use their own client. In total, three clients will be created and the
PHP worker executing this script will open two connections to
127.0.0.1 and one connection to each of
rs1.example.com and rs2.example.com.
If the driver discovers additional members of the replica set after issuing
isMaster commands, it will open additional connections to
those servers as well.
If the same worker executes the script again in a second request, the three
clients will be re-used and no new connections should be made. Depending on
how long ago the previous request was served, the driver may need to issue
additional isMaster commands to update its view of the
topologies.
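The lookup described above can be pictured with a small model. This is a minimal sketch, not the driver's actual implementation: the ClientRegistry class, the md5/serialize hash, and the stdClass stand-in for a libmongoc client are all assumptions made for illustration, and the URIs are only examples.

```php
<?php

// Hypothetical, simplified model of per-worker client persistence.
// The real driver keeps libmongoc clients inside the extension; here a
// plain stdClass object stands in for each client.
final class ClientRegistry
{
    /** @var array<string, object> clients keyed by constructor-argument hash */
    private $clients = [];

    public function get(string $uri, array $options = []): object
    {
        // Assumption: any stable hash of the URI string and array options
        // is enough to illustrate the idea.
        $hash = md5(serialize([$uri, $options]));

        if (!isset($this->clients[$hash])) {
            // No persisted client for this hash: create and persist one.
            $this->clients[$hash] = new stdClass();
        }

        return $this->clients[$hash];
    }

    public function count(): int
    {
        return count($this->clients);
    }
}

$registry = new ClientRegistry();
$a = $registry->get('mongodb://127.0.0.1');
$b = $registry->get('mongodb://127.0.0.1');
$c = $registry->get('mongodb://127.0.0.1:27017');
$d = $registry->get('mongodb://rs1.example.com,rs2.example.com/', ['replicaSet' => 'myReplicaSet']);

var_dump($a === $b);            // bool(true): identical arguments share a client
var_dump($a === $c);            // bool(false): a different URI string gets its own client
echo $registry->count(), "\n";  // 3
```

Because the registry object would live as long as the worker process, a second request with the same arguments would find the existing entries instead of creating new ones.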
Socket persistence (driver versions before 1.2.0)
Versions of the PHP driver before 1.2.0 utilize PHP's Streams API for
database connections, using an API within
libmongoc to designate
custom handlers for socket communication; however, a new libmongoc client is
created for each MongoDB\Driver\Manager. As a result,
the driver persists individual database connections but not authentication
state or topology information. This means that the driver needs to issue
commands at the start of each request to authenticate and
discover the server topology.
Database connections are persisted by a hash derived from the server's
host, port, and the URI string used to construct the
MongoDB\Driver\Manager. The constructor's array
options are not included in this hash.
Versions of the driver >= 1.1.8 and < 1.2.0 do not persist sockets
for SSL connections. See
PHPC-720 for
additional information.
Despite its shortcomings with persisting SSL connections and topology
information, this version of the driver supports all
SSL context options since it uses
PHP's Streams API.
Persisting Data
Serialization and deserialization of PHP variables into MongoDB
This document discusses how compound structures (documents,
arrays, objects) are persisted through the drivers, and how they are brought
back into PHP land.
Serialization to BSON
Arrays
If an array is a packed array (i.e. the keys start
at 0 and are sequential without gaps), it serializes as a BSON array.
If the array is not packed (i.e. it has associative (string) keys, the
keys don't start at 0, or there are gaps), it serializes as a BSON
object.
A top-level (root) document always serializes as a
BSON document.
Examples
These serialize as a BSON array:
[ 8, 5, 2, 3 ] => [ 8, 5, 2, 3 ]
[ 0 => 4, 1 => 9 ] => [ 4, 9 ]
These serialize as a BSON document:
[ 0 => 1, 2 => 8, 3 => 12 ] => { "0" : 1, "2" : 8, "3" : 12 }
[ "foo" => 42 ] => { "foo" : 42 }
[ 1 => 9, 0 => 10 ] => { "1" : 9, "0" : 10 }
Note that the five examples are extracts of a full
document, and represent only one value inside a
document.
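The packed-array rule can be expressed in a few lines of plain PHP. The sketch below is only a model of the rule as stated above, not the extension's code; the function names isPackedArray and bsonTypeFor are hypothetical.

```php
<?php

// Hypothetical helper: true when the keys are 0, 1, 2, ... in iteration
// order, i.e. they start at 0 and are sequential without gaps.
function isPackedArray(array $value): bool
{
    $expected = 0;
    foreach ($value as $key => $unused) {
        if ($key !== $expected++) {
            return false;
        }
    }

    return true;
}

// Hypothetical helper: names the BSON type a PHP array would map to under
// the rules above. A root value always maps to a document.
function bsonTypeFor(array $value, bool $isRoot = false): string
{
    return (!$isRoot && isPackedArray($value)) ? 'array' : 'document';
}

echo bsonTypeFor([8, 5, 2, 3]), "\n";                // array
echo bsonTypeFor([0 => 4, 1 => 9]), "\n";            // array
echo bsonTypeFor([0 => 1, 2 => 8, 3 => 12]), "\n";   // document
echo bsonTypeFor(['foo' => 42]), "\n";               // document
echo bsonTypeFor([1 => 9, 0 => 10]), "\n";           // document
echo bsonTypeFor([8, 5, 2, 3], true), "\n";          // document
```

Checking the keys in iteration order, rather than sorting them first, is what makes the last non-root example a document: its keys are 0 and 1, but they do not appear in that order.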
Objects
If an object is of the stdClass class, serialize
as a BSON document.
If an object is a supported class that implements
MongoDB\BSON\Type, then use the BSON
serialization logic for that specific type.
MongoDB\BSON\Type instances (excluding
MongoDB\BSON\Serializable) may only be
serialized as a document field value. Attempting to serialize such an
object as a root document will throw a
MongoDB\Driver\Exception\UnexpectedValueException.
If an object is of an unknown class implementing the
MongoDB\BSON\Type interface, then throw a
MongoDB\Driver\Exception\UnexpectedValueException.
If an object is of any other class, without implementing any special
interface, serialize as a BSON document. Keep only
public properties, and ignore
protected and private
properties.
If an object is of a class that implements the
MongoDB\BSON\Serializable interface, call
MongoDB\BSON\Serializable::bsonSerialize and use
the returned array or stdClass to serialize as a
BSON document or array. The BSON type will be determined by the following:
Root documents must be serialized as a BSON
document.
MongoDB\BSON\Persistable objects must be
serialized as a BSON document.
If MongoDB\BSON\Serializable::bsonSerialize
returns a packed array, serialize as a BSON array.
If MongoDB\BSON\Serializable::bsonSerialize
returns a non-packed array or stdClass,
serialize as a BSON document.
If MongoDB\BSON\Serializable::bsonSerialize
did not return an array or stdClass, throw an
MongoDB\Driver\Exception\UnexpectedValueException
exception.
If an object is of a class that implements the
MongoDB\BSON\Persistable interface (which
implies MongoDB\BSON\Serializable), obtain
the properties in a similar way as in the previous paragraphs, but
also add an additional property
__pclass as a Binary value, with subtype
0x80 and data bearing the fully qualified class name
of the object that is being serialized.
The __pclass property is added to the array or
object returned by
MongoDB\BSON\Serializable::bsonSerialize, which
means it will overwrite any __pclass key/property in
the MongoDB\BSON\Serializable::bsonSerialize
return value. If you want to avoid this behaviour and set your own
__pclass value, you must not
implement MongoDB\BSON\Persistable and
should instead implement
MongoDB\BSON\Serializable directly.
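Two of the rules above lend themselves to a short plain-PHP illustration: keeping only public properties, and the way __pclass overwrites an existing key. This is a rough sketch that runs without the extension; the class PlainExample and the helpers publicProperties and withPclass are hypothetical, and the Binary value is modelled as a simple array rather than a real MongoDB\BSON\Binary object.

```php
<?php

class PlainExample
{
    public $foo = 42;
    protected $prot = "wine";
    private $fpr = "cheese";
}

// Hypothetical helper: get_object_vars() only returns the properties that
// are accessible from the calling scope, so calling it from outside the
// class yields the public properties only.
function publicProperties(object $object): array
{
    return get_object_vars($object);
}

// Hypothetical helper modelling the Persistable rule: __pclass is added
// after bsonSerialize() has produced its fields, so an existing __pclass
// key is overwritten. The array stands in for a Binary with subtype 0x80.
function withPclass(array $fields, string $className): array
{
    $fields['__pclass'] = ['subtype' => 0x80, 'data' => $className];

    return $fields;
}

var_dump(publicProperties(new PlainExample()) === ['foo' => 42]); // bool(true)

$document = withPclass(['foo' => 42, '__pclass' => 'my own value'], 'UpperClass');
echo $document['__pclass']['data'], "\n"; // UpperClass
```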
Examples
<?php

$object = new stdClass;
$object->foo = 42;
// => { "foo" : 42 }

class MyClass {
    public $foo = 42;
    protected $prot = "wine";
    private $fpr = "cheese";
} // => { "foo" : 42 }

class AnotherClass1 implements MongoDB\BSON\Serializable {
    public $foo = 42;
    protected $prot = "wine";
    private $fpr = "cheese";
    function bsonSerialize() {
        return [ 'foo' => $this->foo, 'prot' => $this->prot ];
    }
} // => { "foo" : 42, "prot" : "wine" }

class AnotherClass2 implements MongoDB\BSON\Serializable {
    public $foo = 42;
    function bsonSerialize() {
        return $this;
    }
} // => MongoDB\Driver\Exception\UnexpectedValueException("bsonSerialize() did not return an array or stdClass")

// A packed array returned for a root document still becomes a BSON document
class AnotherClass3 implements MongoDB\BSON\Serializable {
    private $elements = [ 'foo', 'bar' ];
    function bsonSerialize() {
        return $this->elements;
    }
} // => { "0" : "foo", "1" : "bar" }

// Embedded value returning a non-packed array
class AnotherClass4 implements MongoDB\BSON\Serializable {
    private $elements = [ 0 => 'foo', 2 => 'bar' ];
    function bsonSerialize() {
        return $this->elements;
    }
}
class ContainerClass4 implements MongoDB\BSON\Serializable {
    public $things;
    function __construct() {
        $this->things = new AnotherClass4();
    }
    function bsonSerialize() {
        return [ 'things' => $this->things ];
    }
} // => { "things" : { "0" : "foo", "2" : "bar" } }

// Embedded value returning a packed array
class AnotherClass5 implements MongoDB\BSON\Serializable {
    private $elements = [ 0 => 'foo', 2 => 'bar' ];
    function bsonSerialize() {
        return array_values($this->elements);
    }
}
class ContainerClass5 implements MongoDB\BSON\Serializable {
    public $things;
    function __construct() {
        $this->things = new AnotherClass5();
    }
    function bsonSerialize() {
        return [ 'things' => $this->things ];
    }
} // => { "things" : [ "foo", "bar" ] }

// Embedded value returning a stdClass
class AnotherClass6 implements MongoDB\BSON\Serializable {
    private $elements = [ 'foo', 'bar' ];
    function bsonSerialize() {
        return (object) $this->elements;
    }
}
class ContainerClass6 implements MongoDB\BSON\Serializable {
    public $things;
    function __construct() {
        $this->things = new AnotherClass6();
    }
    function bsonSerialize() {
        return [ 'things' => $this->things ];
    }
} // => { "things" : { "0" : "foo", "1" : "bar" } }

class UpperClass implements MongoDB\BSON\Persistable {
    public $foo = 42;
    protected $prot = "wine";
    private $fpr = "cheese";
    function bsonSerialize() {
        return [ 'foo' => $this->foo, 'prot' => $this->prot ];
    }
    // Required by Persistable; not used when serializing
    function bsonUnserialize(array $data) {
    }
} // => { "foo" : 42, "prot" : "wine", "__pclass" : { "$type" : "80", "$binary" : "VXBwZXJDbGFzcw==" } }
Deserialization from BSON
The legacy mongo extension deserialized
both BSON documents and arrays as PHP arrays. While PHP arrays are
convenient to work with, this behavior was problematic because different
BSON types could deserialize to the same PHP value (e.g.
{"0": "foo"} and ["foo"]) and make it
impossible to infer the original BSON type. By default, the current driver
addresses this concern by ensuring that BSON arrays and documents are
converted to PHP arrays and objects, respectively.
For compound types, there are three data types:
root
refers to the top-level BSON document only
document
refers to embedded BSON documents only
array
refers to a BSON array
Each of those three data types can be mapped against different PHP types.
The possible mapping values are:
not set or NULL (default)
A BSON array will be deserialized as a PHP array.
A BSON document (root or embedded) without a
__pclass property becomes a PHP stdClass object, with each
BSON document key set as a public stdClass
property.
A __pclass property is only deemed to exist if
there exists a property with that name, and it is a Binary value,
and the sub-type of the Binary value is 0x80. If any of these three
conditions is not met, the __pclass property does not exist and
should be treated as any other normal property.
A BSON document (root or embedded) with a
__pclass property becomes a PHP object of
the class name as defined by the __pclass
property.
If the named class implements the
MongoDB\BSON\Persistable interface,
then the properties of the BSON document, including the
__pclass property, are sent as an associative
array to the
MongoDB\BSON\Unserializable::bsonUnserialize
function to initialise the object's properties.
If the named class does not exist or does not implement the
MongoDB\BSON\Persistable interface,
stdClass will be used and each BSON document
key (including __pclass) will be set as a
public stdClass property.
"array"
Turns a BSON array or BSON document into a PHP array. There will be no
special treatment of a __pclass property,
but it may be set as an element in the returned array if it was
present in the BSON document.
"object" or "stdClass"
Turns a BSON array or BSON document into a
stdClass object. There will be no special
treatment of a __pclass property, but it may
be set as a public property in the returned object if it was present
in the BSON document.
any other string
Defines the class name that the BSON array or BSON object should be
deserialized as. For BSON objects that include
__pclass properties, that class will take
priority.
If the named class does not exist, is not concrete (i.e. it is
abstract or an interface), or does not implement
MongoDB\BSON\Unserializable then an
MongoDB\Driver\Exception\InvalidArgumentException
exception is thrown.
If the BSON object has a __pclass property and
that class exists and implements
MongoDB\BSON\Persistable it will
supersede the class provided in the type map.
The properties of the BSON document, including
the __pclass property if it exists, will be sent
as an associative array to the
MongoDB\BSON\Unserializable::bsonUnserialize
function to initialise the object's properties.
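The way a mapping value and a __pclass property combine for a BSON document can be summarised in code. The following is a simplified model of the rules above that runs without the extension, not the driver's own logic. The function resolveDocumentType and the two Sketch interfaces (stand-ins for MongoDB\BSON\Unserializable and MongoDB\BSON\Persistable) are hypothetical. The $pclass argument is assumed to be null unless the document has a valid __pclass property, and BSON arrays are not covered.

```php
<?php

// Stand-ins for MongoDB\BSON\Unserializable and MongoDB\BSON\Persistable
// so that the sketch runs without the extension (hypothetical names).
interface SketchUnserializable {}
interface SketchPersistable extends SketchUnserializable {}

class MyClass {}
class YourClass implements SketchUnserializable {}
class OurClass implements SketchPersistable {}
class TheirClass extends OurClass {}

// Hypothetical helper: returns the PHP type a BSON document would be
// deserialized as, given the type map value and the class named by a
// valid __pclass property (or null when there is none).
function resolveDocumentType(?string $mapping, ?string $pclass): string
{
    if ($mapping === 'array') {
        return 'array';
    }
    if ($mapping === 'object' || $mapping === 'stdClass') {
        return 'stdClass';
    }

    // __pclass is only honoured when it names an existing Persistable class.
    $pclassApplies = $pclass !== null
        && class_exists($pclass)
        && is_subclass_of($pclass, SketchPersistable::class);

    if ($mapping === null) {
        return $pclassApplies ? $pclass : 'stdClass';
    }

    // Any other string must name a concrete class that is Unserializable.
    if (!class_exists($mapping)
        || (new ReflectionClass($mapping))->isAbstract()
        || !is_subclass_of($mapping, SketchUnserializable::class)) {
        throw new InvalidArgumentException("$mapping cannot be used in a type map");
    }

    return $pclassApplies ? $pclass : $mapping;
}

echo resolveDocumentType(null, null), "\n";                // stdClass
echo resolveDocumentType(null, 'YourClass'), "\n";         // stdClass
echo resolveDocumentType(null, 'OurClass'), "\n";          // OurClass
echo resolveDocumentType('YourClass', 'MyClass'), "\n";    // YourClass
echo resolveDocumentType('YourClass', 'TheirClass'), "\n"; // TheirClass
echo resolveDocumentType('array', 'OurClass'), "\n";       // array
```

The exception message here is illustrative only; the driver's own messages differ by failure reason.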
TypeMaps
TypeMaps can be set through the
MongoDB\Driver\Cursor::setTypeMap method on a
MongoDB\Driver\Cursor object, or the
$typeMap argument of
MongoDB\BSON\toPHP. Each of the three
classes (root, document and
array) can be individually set.
If the value in the map is NULL, it means the same as the
default value for that item.
Examples
These examples use the following classes:
MyClass
which does not implement any interface
YourClass
which implements
MongoDB\BSON\Unserializable
OurClass
which implements
MongoDB\BSON\Persistable
TheirClass
which extends OurClass
The MongoDB\BSON\Unserializable::bsonUnserialize
method of YourClass, OurClass, and TheirClass iterates over the array and sets
the properties without modifications. It also sets
the $unserialized property to true:
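function bsonUnserialize( array $map )
{
    // Copy every key of the BSON document onto the object unchanged
    foreach ( $map as $k => $value )
    {
        $this->$k = $value;
    }
    $this->unserialized = true;
}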
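To see what such a method does to an object, it can be called by hand on a plain class. This is a sketch that runs without the extension: YourClassSketch is a hypothetical class that does not implement the real interface, and the string used for __pclass is a placeholder for what would be a Binary value in the driver.

```php
<?php

// The attribute permits dynamic properties on PHP 8.2 and later; older
// PHP 7 versions treat the line as a comment.
#[\AllowDynamicProperties]
class YourClassSketch
{
    public $unserialized = false;

    public function bsonUnserialize(array $map)
    {
        // Copy every key of the document onto the object unchanged.
        foreach ($map as $k => $value) {
            $this->$k = $value;
        }
        $this->unserialized = true;
    }
}

$object = new YourClassSketch();
$object->bsonUnserialize(['foo' => 'yes', '__pclass' => 'placeholder']);

var_dump($object->foo);          // string(3) "yes"
var_dump($object->unserialized); // bool(true)
```

Note that __pclass ends up as an ordinary property, which matches the examples in this section where it appears alongside the other fields.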
/* typemap: [] (all defaults) */
{ "foo": "yes", "bar" : false }
  -> stdClass { $foo => 'yes', $bar => false }

{ "foo": "no", "array" : [ 5, 6 ] }
  -> stdClass { $foo => 'no', $array => [ 5, 6 ] }

{ "foo": "no", "obj" : { "embedded" : 3.14 } }
  -> stdClass { $foo => 'no', $obj => stdClass { $embedded => 3.14 } }

{ "foo": "yes", "__pclass": "MyClass" }
  -> stdClass { $foo => 'yes', $__pclass => 'MyClass' }

{ "foo": "yes", "__pclass": { "$type" : "80", "$binary" : "MyClass" } }
  -> stdClass { $foo => 'yes', $__pclass => Binary(0x80, 'MyClass') }

{ "foo": "yes", "__pclass": { "$type" : "80", "$binary" : "YourClass" } }
  -> stdClass { $foo => 'yes', $__pclass => Binary(0x80, 'YourClass') }

{ "foo": "yes", "__pclass": { "$type" : "80", "$binary" : "OurClass" } }
  -> OurClass { $foo => 'yes', $__pclass => Binary(0x80, 'OurClass'), $unserialized => true }

{ "foo": "yes", "__pclass": { "$type" : "44", "$binary" : "YourClass" } }
  -> stdClass { $foo => 'yes', $__pclass => Binary(0x44, 'YourClass') }
/* typemap: [ "root" => "MissingClass" ] */
{ "foo": "yes" }
  -> MongoDB\Driver\Exception\InvalidArgumentException("MissingClass does not exist")

/* typemap: [ "root" => "MyClass" ] */
{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "MyClass" } }
  -> MongoDB\Driver\Exception\InvalidArgumentException("MyClass does not implement Unserializable interface")

/* typemap: [ "root" => "MongoDB\BSON\Unserializable" ] */
{ "foo": "yes" }
  -> MongoDB\Driver\Exception\InvalidArgumentException("Unserializable is not a concrete class")

/* typemap: [ "root" => "YourClass" ] */
{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "MongoDB\BSON\Unserializable" } }
  -> YourClass { $foo => "yes", $__pclass => Binary(0x80, "MongoDB\BSON\Unserializable"), $unserialized => true }

/* typemap: [ "root" => "YourClass" ] */
{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "MyClass" } }
  -> YourClass { $foo => "yes", $__pclass => Binary(0x80, "MyClass"), $unserialized => true }

/* typemap: [ "root" => "YourClass" ] */
{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "OurClass" } }
  -> OurClass { $foo => "yes", $__pclass => Binary(0x80, "OurClass"), $unserialized => true }

/* typemap: [ "root" => "YourClass" ] */
{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "TheirClass" } }
  -> TheirClass { $foo => "yes", $__pclass => Binary(0x80, "TheirClass"), $unserialized => true }

/* typemap: [ "root" => "OurClass" ] */
{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "TheirClass" } }
  -> TheirClass { $foo => "yes", $__pclass => Binary(0x80, "TheirClass"), $unserialized => true }
/* typemap: [ 'root' => 'YourClass' ] */
{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "YourClass" } }
  -> YourClass { $foo => 'yes', $__pclass => Binary(0x80, 'YourClass'), $unserialized => true }
/* typemap: [ 'root' => 'array', 'document' => 'array' ] */
{ "foo": "yes", "bar" : false }
  -> [ "foo" => "yes", "bar" => false ]

{ "foo": "no", "array" : [ 5, 6 ] }
  -> [ "foo" => "no", "array" => [ 5, 6 ] ]

{ "foo": "no", "obj" : { "embedded" : 3.14 } }
  -> [ "foo" => "no", "obj" => [ "embedded" => 3.14 ] ]

{ "foo": "yes", "__pclass": "MyClass" }
  -> [ "foo" => "yes", "__pclass" => "MyClass" ]

{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "MyClass" } }
  -> [ "foo" => "yes", "__pclass" => Binary(0x80, "MyClass") ]

{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "OurClass" } }
  -> [ "foo" => "yes", "__pclass" => Binary(0x80, "OurClass") ]
/* typemap: [ 'root' => 'object', 'document' => 'object' ] */
{ "foo": "yes", "__pclass": { "$type": "80", "$binary": "MyClass" } }
  -> stdClass { $foo => "yes", $__pclass => Binary(0x80, "MyClass") }