Driver Architecture and Internals
Explains the driver architecture and special features
Architecture
Architecture Overview
This section explains how the different parts of the driver fit
together, from the different language runtimes, through the extensions, to
the PHP libraries on top. This new architecture has replaced the old
mongo extension. We refer to the new one
as the mongodb extension.
Architecture Diagram
At the top of this stack sits a pure
PHP library, which we will
distribute as a Composer package. This library will provide an API similar
to what users have come to expect from the old mongo driver (e.g. CRUD methods,
database and collection objects, command helpers) and we expect it to be a
common dependency for most applications built with MongoDB. This library
will also implement common
specifications, in the
interest of improving API consistency across all of the
drivers maintained by
MongoDB (and hopefully some community drivers, too).
Sitting below that library we have the lower-level drivers, one per platform.
These extensions will effectively form the glue between PHP and HHVM and our
system libraries (libmongoc and
libbson). These extensions
will expose an identical public API for the most essential and
performance-sensitive functionality:
Connection management
BSON encoding and decoding
Object document serialization (to support ODM libraries)
Executing commands and write operations
Handling queries and cursors
By decoupling the driver internals and a high-level API into extensions and
PHP libraries, respectively, we hope to reduce our maintenance burden and
allow for faster iteration on new features. As a welcome side effect, this
also makes it easier for anyone to contribute to the driver. Additionally,
an identical public API for these extensions will make it that much easier
to port an application across PHP runtimes, whether the application uses
the low-level driver directly or a higher-level PHP library.
GridFS is a great example
of why we chose this direction.
Although we implemented GridFS in C for our old mongo driver, it is actually
quite a high-level specification. Its API is just an abstraction for
accessing two collections: files (i.e. metadata) and chunks (i.e. blocks of
data). Likewise, all of the syntactic sugar found in the old mongo driver,
such as processing uploaded files or exposing GridFS files as PHP streams,
can be implemented in pure PHP. Provided we have performant methods for
reading from and writing to GridFS' collections (and thanks to our
low-level extensions, we will), shifting this API to PHP is a win-win.
Earlier I mentioned that we expect the PHP library to be a common
dependency for most applications, but not
all. Some users may prefer to stick to the no-frills
API offered by the extensions, or create their own high-level abstraction
(akin to Doctrine MongoDB for
the old mongo driver). Future libraries could include a PHP library geared
for MongoDB administration, which provides an API for various user
management and ops commands. The next major version of
Doctrine MongoDB ODM will
likely also sit directly atop the extensions.
While we will continue to maintain and support the old mongo driver and its
users for the foreseeable future, we invite everyone to use the
next-generation driver and consider it for any new projects going forward.
You can find all of the essential components across GitHub and JIRA:
Driver Source Code and JIRA Locations
Project                           GitHub                      JIRA
PHP Library                       mongodb/mongo-php-library   PHPLIB
PHP 5 and PHP 7 Driver (phongo)   mongodb/mongo-php-driver    PHPC
HHVM Driver (hippo)               mongodb/mongo-hhvm-driver   HHVM
The existing PHP project in JIRA
will remain open for reporting bugs against the old mongo driver, but we
would ask that you use the new projects above for anything pertaining to
our next-generation drivers.
Persisting Data
Serialisation and deserialisation of PHP variables into MongoDB
This document discusses how compound structures (documents,
arrays, objects) are persisted through the drivers, and how they are brought
back into PHP land.
Serialisation to BSON
Arrays
If an array is a packed array (i.e. the keys start
at 0 and are sequential without gaps), it serializes as a BSON array.
If the array is not packed (i.e. it has associative (string) keys, the
keys don't start at 0, or there are gaps), it serializes as a BSON
object.
A top-level (root) document always serializes as a
BSON document.
Examples
These serialize as a BSON array:
[ 8, 5, 2, 3 ] => [ 8, 5, 2, 3 ]
[ 0 => 4, 1 => 9 ] => [ 4, 9 ]
These serialize as a BSON document:
[ 0 => 1, 2 => 8, 3 => 12 ] => { "0" : 1, "2" : 8, "3" : 12 }
[ "foo" => 42 ] => { "foo" : 42 }
[ 1 => 9, 0 => 10 ] => { "1" : 9, "0" : 10 }
Note that the five examples are extracts of a full
document, and represent only one value inside a
document.
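The packed-array rule above can be modeled in a few lines of plain PHP. This is only a sketch of the rule as described here, not the extension's implementation; isPackedArray is a hypothetical helper name introduced for illustration.

```php
<?php
// Hypothetical helper (not part of the driver): reports whether an array is
// "packed" in the sense described above, i.e. its keys are exactly
// 0, 1, 2, ... in that order.
function isPackedArray(array $value): bool
{
    $expected = 0;
    foreach ($value as $key => $unused) {
        if ($key !== $expected) {
            return false; // string key, gap, or out-of-order key
        }
        $expected++;
    }
    return true;
}

var_dump(isPackedArray([8, 5, 2, 3]));              // bool(true)
var_dump(isPackedArray([0 => 4, 1 => 9]));          // bool(true)
var_dump(isPackedArray([0 => 1, 2 => 8, 3 => 12])); // bool(false)
var_dump(isPackedArray(['foo' => 42]));             // bool(false)
var_dump(isPackedArray([1 => 9, 0 => 10]));         // bool(false)
```

On PHP 8.1 and later, the built-in array_is_list() performs the same check.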
Objects
If an object is of the stdClass class, serialize
as a BSON document.
If an object is a supported class that implements
MongoDB\BSON\Type, then use the BSON
serialization logic for that specific type.
MongoDB\BSON\Type instances (excluding
MongoDB\BSON\Serializable) may only be serialized
as a document field value. Attempting to serialize such an object as a
root document will throw a
MongoDB\Driver\Exception\UnexpectedValueException.
If an object is of an unknown class implementing the
MongoDB\BSON\Type interface, then throw a
MongoDB\Driver\Exception\UnexpectedValueException.
If an object is of any other class, without implementing any special
interface, serialize as a BSON document. Keep only
public properties, and ignore
protected and private
properties.
If an object is of a class that implements the
MongoDB\BSON\Serializable interface, call
MongoDB\BSON\Serializable::bsonSerialize and use
the returned array or stdClass to serialize as a
BSON document or array. The BSON type will be determined by the following:
Root documents must be serialized as a BSON
document.
MongoDB\BSON\Persistable objects must be
serialized as a BSON document.
If MongoDB\BSON\Serializable::bsonSerialize
returns a packed array, serialize as a BSON array.
If MongoDB\BSON\Serializable::bsonSerialize
returns a non-packed array or stdClass,
serialize as a BSON document.
If MongoDB\BSON\Serializable::bsonSerialize
did not return an array or stdClass, throw a
MongoDB\Driver\Exception\UnexpectedValueException
exception.
If an object is of a class that implements the
MongoDB\BSON\Persistable interface (which implies
MongoDB\BSON\Serializable), obtain the properties
in a similar way as in the previous paragraphs, but
also add an additional property
__pclass as a Binary value, with subtype
0x80 and data bearing the fully qualified class name
of the object that is being serialized.
The __pclass property is added to the array or
object returned by
MongoDB\BSON\Serializable::bsonSerialize, which
means it will overwrite any __pclass key/property in
the MongoDB\BSON\Serializable::bsonSerialize
return value. If you want to avoid this behaviour and set your own
__pclass value, you must not
implement MongoDB\BSON\Persistable and should
instead implement MongoDB\BSON\Serializable
directly.
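As a rough illustration, the rules above for choosing the BSON type of a bsonSerialize() return value could be modeled as follows. This is a sketch in plain PHP under stated assumptions, not the extension's code: bsonTypeFor() and its parameters are hypothetical names, and PHP's built-in UnexpectedValueException stands in for MongoDB\Driver\Exception\UnexpectedValueException so that the example runs without the extension.

```php
<?php
// Hypothetical model of the type-selection rules; not part of the driver.
function bsonTypeFor($returned, bool $isRoot, bool $isPersistable): string
{
    if (!is_array($returned) && !($returned instanceof stdClass)) {
        // Stand-in for MongoDB\Driver\Exception\UnexpectedValueException.
        throw new UnexpectedValueException(
            'bsonSerialize() did not return an array or stdClass'
        );
    }
    // Root documents and Persistable objects are always BSON documents.
    if ($isRoot || $isPersistable) {
        return 'document';
    }
    if ($returned instanceof stdClass) {
        return 'document';
    }
    // A packed array has keys exactly 0, 1, 2, ... in order.
    $expected = 0;
    foreach ($returned as $key => $unused) {
        if ($key !== $expected++) {
            return 'document';
        }
    }
    return 'array';
}

echo bsonTypeFor(['foo', 'bar'], false, false), "\n";           // array
echo bsonTypeFor(['foo', 'bar'], true, false), "\n";            // document
echo bsonTypeFor([0 => 'foo', 2 => 'bar'], false, false), "\n"; // document
echo bsonTypeFor((object) ['foo', 'bar'], false, false), "\n";  // document
```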
Examples
stdClass {
    public $foo = 42;
} => { "foo" : 42 }
MyClass {
    public $foo = 42;
    protected $prot = "wine";
    private $fpr = "cheese";
} => { "foo" : 42 }
AnotherClass1 implements MongoDB\BSON\Serializable {
    public $foo = 42;
    protected $prot = "wine";
    private $fpr = "cheese";
    function bsonSerialize() {
        return [ 'foo' => $this->foo, 'prot' => $this->prot ];
    }
} => { "foo" : 42, "prot" : "wine" }
AnotherClass2 implements MongoDB\BSON\Serializable {
    public $foo = 42;
    function bsonSerialize() {
        return $this;
    }
} => MongoDB\Driver\Exception\UnexpectedValueException("bsonSerialize() did not return an array or stdClass")
AnotherClass3 implements MongoDB\BSON\Serializable {
    private $elements = [ 'foo', 'bar' ];
    function bsonSerialize() {
        return $this->elements;
    }
} => { "0" : "foo", "1" : "bar" }
ContainerClass implements MongoDB\BSON\Serializable {
    public $things = AnotherClass4 implements MongoDB\BSON\Serializable {
        private $elements = [ 0 => 'foo', 2 => 'bar' ];
        function bsonSerialize() {
            return $this->elements;
        }
    }
    function bsonSerialize() {
        return [ 'things' => $this->things ];
    }
} => { "things" : { "0" : "foo", "2" : "bar" } }
ContainerClass implements MongoDB\BSON\Serializable {
    public $things = AnotherClass5 implements MongoDB\BSON\Serializable {
        private $elements = [ 0 => 'foo', 2 => 'bar' ];
        function bsonSerialize() {
            return array_values($this->elements);
        }
    }
    function bsonSerialize() {
        return [ 'things' => $this->things ];
    }
} => { "things" : [ "foo", "bar" ] }
ContainerClass implements MongoDB\BSON\Serializable {
    public $things = AnotherClass6 implements MongoDB\BSON\Serializable {
        private $elements = [ 'foo', 'bar' ];
        function bsonSerialize() {
            return (object) $this->elements;
        }
    }
    function bsonSerialize() {
        return [ 'things' => $this->things ];
    }
} => { "things" : { "0" : "foo", "1" : "bar" } }
UpperClass implements MongoDB\BSON\Persistable {
    public $foo = 42;
    protected $prot = "wine";
    private $fpr = "cheese";
    function bsonSerialize() {
        return [ 'foo' => $this->foo, 'prot' => $this->prot ];
    }
} => { "foo" : 42, "prot" : "wine", "__pclass" : { "$type" : "80", "$binary" : "VXBwZXJDbGFzcw==" } }
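The "keep only public properties" rule for plain classes (the MyClass example above) resembles what get_object_vars() returns when called from outside the class. The following sketch only illustrates that rule in plain PHP and does not use the driver; publicProperties() is a hypothetical helper name.

```php
<?php
// Same property layout as the MyClass example above.
class MyClass
{
    public $foo = 42;
    protected $prot = "wine";
    private $fpr = "cheese";
}

// Hypothetical helper: called from outside the class scope,
// get_object_vars() returns only the properties accessible there,
// which are the public ones.
function publicProperties(object $object): array
{
    return get_object_vars($object);
}

echo json_encode(publicProperties(new MyClass())), "\n"; // {"foo":42}
```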
Deserialization from BSON
For compound types, there are three data types:
root
refers to the top-level BSON document only
document
refers to embedded BSON documents only
array
refers to a BSON array
Each of those three data types can be mapped against different PHP types.
The possible mapping values are:
not set or NULL (this is the
default)
A BSON array will be deserialized as a PHP array.
A BSON document (root or embedded) without a
__pclass property becomes a PHP stdClass object, with each
BSON document key set as a public stdClass
property.
A __pclass property is only deemed to exist if
there is a property with that name, it is a Binary value,
and the sub-type of the Binary value is 0x80. If any of these three
conditions is not met, the __pclass property does not exist and
should be treated as any other normal property.
A BSON document (root or embedded) with a
__pclass property becomes a PHP object of
the class name as defined by the __pclass
property.
If the named class implements the
MongoDB\BSON\Persistable interface, then the
properties of the BSON document, including the
__pclass property, are sent as an associative
array to the
MongoDB\BSON\Unserializable::bsonUnserialize
function to initialise the object's properties.
If the named class does not exist or does not implement the
MongoDB\BSON\Persistable interface,
stdClass will be used and each BSON document
key (including __pclass) will be set as a
public stdClass property.
"array"
Turns a BSON array or BSON document into a PHP array. There will be no
special treatment of a __pclass property,
but it may be set as an element in the returned array if it was
present in the BSON document.
"object" or "stdClass"
Turns a BSON array or BSON document into a
stdClass object. There will be no special
treatment of a __pclass property, but it may
be set as a public property in the returned object if it was present
in the BSON document.
any other string
Defines the class name that the BSON array or BSON object should be
deserialized as. For BSON objects that include
__pclass properties, that class will take
priority.
If the named class does not exist, is not concrete (i.e. it is
abstract or an interface), or does not implement
MongoDB\BSON\Unserializable, then a
MongoDB\Driver\Exception\InvalidArgumentException
exception is thrown.
If the BSON object has a __pclass property and
that class exists and implements
MongoDB\BSON\Persistable it will supersede the
class provided in the type map.
The properties of the BSON document, including
the __pclass property if it exists, will be sent
as an associative array to the
MongoDB\BSON\Unserializable::bsonUnserialize
function to initialise the object's properties.
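The three checks applied to a class name in a type map (it exists, it is concrete, it implements the required interface) can be sketched in plain PHP with reflection. This is an illustrative model only, not how the extension performs the checks: assertUsableTypeMapClass() and UnserializableStandIn are hypothetical names, the built-in InvalidArgumentException stands in for MongoDB\Driver\Exception\InvalidArgumentException, and the messages are worded for the sketch.

```php
<?php
// Hypothetical sketch of the checks applied to a type map class name.
function assertUsableTypeMapClass(string $class, string $requiredInterface): void
{
    if (!class_exists($class) && !interface_exists($class)) {
        throw new InvalidArgumentException("$class does not exist");
    }
    $reflection = new ReflectionClass($class);
    // isInstantiable() is false for interfaces and abstract classes.
    if (!$reflection->isInstantiable()) {
        throw new InvalidArgumentException("$class is not a concrete class");
    }
    if (!$reflection->implementsInterface($requiredInterface)) {
        throw new InvalidArgumentException(
            "$class does not implement $requiredInterface"
        );
    }
}

// Stand-ins for MongoDB\BSON\Unserializable and two example classes,
// so the sketch runs without the extension.
interface UnserializableStandIn
{
    public function bsonUnserialize(array $data);
}
class MyClass {}
class YourClass implements UnserializableStandIn
{
    public function bsonUnserialize(array $data) {}
}

$names = ['MissingClass', 'UnserializableStandIn', 'MyClass', 'YourClass'];
foreach ($names as $name) {
    try {
        assertUsableTypeMapClass($name, 'UnserializableStandIn');
        echo "$name is usable\n";
    } catch (InvalidArgumentException $e) {
        echo $e->getMessage(), "\n";
    }
}
```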
TypeMaps
TypeMaps can be set through the
MongoDB\Driver\Cursor::setTypeMap method on a
MongoDB\Driver\Cursor object, or the
$typeMap argument of
MongoDB\BSON\toPHP. Each of the three
classes (root, document and
array) can be individually set.
If the value in the map is NULL, it means the same as the
default value for that item.
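A minimal sketch of passing a type map to MongoDB\BSON\toPHP follows. It assumes the mongodb extension is loaded and that the installed version still provides the MongoDB\BSON\fromPHP and MongoDB\BSON\toPHP functions; decodeWithArrays() is a hypothetical wrapper name introduced for illustration. The same array could be passed to MongoDB\Driver\Cursor::setTypeMap on a cursor.

```php
<?php
// Hypothetical wrapper: decodes the root document and every embedded
// document as PHP arrays instead of stdClass objects.
function decodeWithArrays(string $bson): array
{
    $typeMap = ['root' => 'array', 'document' => 'array'];
    return MongoDB\BSON\toPHP($bson, $typeMap);
}

// Only run the demonstration when the extension functions are available.
if (function_exists('MongoDB\BSON\fromPHP')) {
    $bson = MongoDB\BSON\fromPHP(['foo' => 'no', 'obj' => ['embedded' => 3.14]]);
    var_dump(decodeWithArrays($bson));
}
```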
Examples
These examples use the following classes:
MyClass
which does not implement any interface
YourClass
which implements MongoDB\BSON\Unserializable
OurClass
which implements MongoDB\BSON\Persistable
TheirClass
which extends OurClass
The MongoDB\BSON\Unserializable::bsonUnserialize
methods of YourClass, OurClass and TheirClass iterate over the array and set
the properties without modification. They also set
the $unserialized property to true:
function bsonUnserialize( array $map )
{
    foreach ( $map as $k => $value )
    {
        $this->$k = $value;
    }
    $this->unserialized = true;
}
/* typemap: [] (all defaults) */
{ "foo": "yes", "bar" : false }
-> stdClass { $foo => 'yes', $bar => false }
{ "foo": "no", "array" : [ 5, 6 ] }
-> stdClass { $foo => 'no', $array => [ 5, 6 ] }
{ "foo": "no", "obj" : { "embedded" : 3.14 } }
-> stdClass { $foo => 'no', $obj => stdClass { $embedded => 3.14 } }
{ "foo": "yes", "__pclass": "MyClass" }
-> stdClass { $foo => 'yes', $__pclass => 'MyClass' }
{ "foo": "yes", "__pclass": { "$type" : "80", "$binary" : "MyClass" } }
-> stdClass { $foo => 'yes', $__pclass => Binary(0x80, 'MyClass') }
{ "foo": "yes", "__pclass": { "$type" : "80", "$binary" : "YourClass" } }
-> stdClass { $foo => 'yes', $__pclass => Binary(0x80, 'YourClass') }
{ "foo": "yes", "__pclass": { "$type" : "80", "$binary" : "OurClass" } }
-> OurClass { $foo => 'yes', $__pclass => Binary(0x80, 'OurClass'), $unserialized => true }
{ "foo": "yes", "__pclass": { "$type" : "44", "$binary" : "YourClass" } }
-> stdClass { $foo => 'yes', $__pclass => Binary(0x44, 'YourClass') }
/* typemap: [ "root" => "MissingClass" ] */
{ "foo": "yes" }
-> MongoDB\Driver\Exception\InvalidArgumentException("MissingClass does not exist")
/* typemap: [ "root" => "MyClass" ] */
{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "MyClass" } }
-> MongoDB\Driver\Exception\InvalidArgumentException("MyClass does not implement Unserializable interface")
/* typemap: [ "root" => "MongoDB\BSON\Unserializable" ] */
{ "foo": "yes" }
-> MongoDB\Driver\Exception\InvalidArgumentException("Unserializable is not a concrete class")
/* typemap: [ "root" => "YourClass" ] */
{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "MongoDB\BSON\Unserializable" } }
-> YourClass { $foo => "yes", $__pclass => Binary(0x80, "MongoDB\BSON\Unserializable"), $unserialized => true }
/* typemap: [ "root" => "YourClass" ] */
{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "MyClass" } }
-> YourClass { $foo => "yes", $__pclass => Binary(0x80, "MyClass"), $unserialized => true }
/* typemap: [ "root" => "YourClass" ] */
{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "OurClass" } }
-> OurClass { $foo => "yes", $__pclass => Binary(0x80, "OurClass"), $unserialized => true }
/* typemap: [ "root" => "YourClass" ] */
{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "TheirClass" } }
-> TheirClass { $foo => "yes", $__pclass => Binary(0x80, "TheirClass"), $unserialized => true }
/* typemap: [ "root" => "OurClass" ] */
{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "TheirClass" } }
-> TheirClass { $foo => "yes", $__pclass => Binary(0x80, "TheirClass"), $unserialized => true }
/* typemap: [ 'root' => 'YourClass' ] */
{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "YourClass" } }
-> YourClass { $foo => 'yes', $__pclass => Binary(0x80, 'YourClass'), $unserialized => true }
/* typemap: [ 'root' => 'array', 'document' => 'array' ] */
{ "foo": "yes", "bar" : false }
-> [ "foo" => "yes", "bar" => false ]
{ "foo": "no", "array" : [ 5, 6 ] }
-> [ "foo" => "no", "array" => [ 5, 6 ] ]
{ "foo": "no", "obj" : { "embedded" : 3.14 } }
-> [ "foo" => "no", "obj" => [ "embedded" => 3.14 ] ]
{ "foo": "yes", "__pclass": "MyClass" }
-> [ "foo" => "yes", "__pclass" => "MyClass" ]
{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "MyClass" } }
-> [ "foo" => "yes", "__pclass" => Binary(0x80, "MyClass") ]
{ "foo": "yes", "__pclass" : { "$type": "80", "$binary": "OurClass" } }
-> [ "foo" => "yes", "__pclass" => Binary(0x80, "OurClass") ]
/* typemap: [ 'root' => 'object', 'document' => 'object' ] */
{ "foo": "yes", "__pclass": { "$type": "80", "$binary": "MyClass" } }
-> stdClass { $foo => "yes", $__pclass => Binary(0x80, "MyClass") }