Driver Architecture and Internals
Explains the driver architecture and special features.
Architecture

Architecture Overview

This section explains how all the different parts of the driver fit together, from the different language runtimes, through the extensions, to the PHP libraries on top. This new architecture has replaced the old mongo extension. We refer to the new one as the mongodb extension.

Figure: Architecture Diagram

At the top of this stack sits a pure PHP library, which we will distribute as a Composer package. This library will provide an API similar to what users have come to expect from the old mongo driver (e.g. CRUD methods, database and collection objects, command helpers) and we expect it to be a common dependency for most applications built with MongoDB. This library will also implement common specifications, in the interest of improving API consistency across all of the drivers maintained by MongoDB (and hopefully some community drivers, too).

Sitting below that library we have the lower-level drivers, one per platform. These extensions will effectively form the glue between PHP and HHVM and our system libraries (libmongoc and libbson). These extensions will expose an identical public API for the most essential and performance-sensitive functionality:

- Connection management
- BSON encoding and decoding
- Object document serialization (to support ODM libraries)
- Executing commands and write operations
- Handling queries and cursors

By decoupling the driver internals and a high-level API into extensions and PHP libraries, respectively, we hope to reduce our maintenance burden and allow for faster iteration on new features. As a welcome side effect, this also makes it easier for anyone to contribute to the driver. Additionally, an identical public API for these extensions will make it that much easier to port an application across PHP runtimes, whether the application uses the low-level driver directly or a higher-level PHP library.

GridFS is a great example of why we chose this direction. Although we implemented GridFS in C for our old mongo driver, it is actually quite a high-level specification. Its API is just an abstraction for accessing two collections: files (i.e. metadata) and chunks (i.e. blocks of data). Likewise, all of the syntactic sugar found in the old mongo driver, such as processing uploaded files or exposing GridFS files as PHP streams, can be implemented in pure PHP. Provided we have performant methods for reading from and writing to GridFS' collections (and thanks to our low-level extensions, we will), shifting this API to PHP is win-win.

As mentioned earlier, we expect the PHP library to be a common dependency for most applications, but not all. Some users may prefer to stick to the no-frills API offered by the extensions, or create their own high-level abstraction (akin to Doctrine MongoDB for the old mongo driver). Future libraries could include a PHP library geared for MongoDB administration, which provides an API for various user management and ops commands. The next major version of Doctrine MongoDB ODM will likely also sit directly atop the extensions.

While we will continue to maintain and support the old mongo driver and its users for the foreseeable future, we invite everyone to use the next-generation driver and consider it for any new projects going forward. You can find all of the essential components across GitHub and JIRA:

Driver Source Code and JIRA Locations

Project                           GitHub                      JIRA
PHP Library                       mongodb/mongo-php-library   PHPLIB
PHP 5 and PHP 7 Driver (phongo)   mongodb/mongo-php-driver    PHPC
HHVM Driver (hippo)               mongodb/mongo-hhvm-driver   HHVM
The existing PHP project in JIRA will remain open for reporting bugs against the old mongo driver, but we would ask that you use the new projects above for anything pertaining to our next-generation drivers.
Persisting Data

Serialisation and deserialisation of PHP variables into MongoDB

This document discusses how compound structures (documents, arrays, objects) are persisted through the drivers, and how they are brought back into PHP land.
Serialisation to BSON
Arrays

If an array is a packed array, i.e. the keys start at 0 and are sequential without gaps, it serializes as a BSON array.

If the array is not packed, i.e. it has associative (string) keys, the keys don't start at 0, or there are gaps, it serializes as a BSON object.

A top-level (root) document always serializes as a BSON document.
Examples

These serialize as a BSON array:

    [ 8, 5, 2, 3 ]       => [ 8, 5, 2, 3 ]
    [ 0 => 4, 1 => 9 ]   => [ 4, 9 ]

These serialize as a BSON document:

    [ 0 => 1, 2 => 8, 3 => 12 ]   => { "0" : 1, "2" : 8, "3" : 12 }
    [ "foo" => 42 ]               => { "foo" : 42 }
    [ 1 => 9, 0 => 10 ]           => { "1" : 9, "0" : 10 }

Note that the five examples are extracts of a full document, and represent only one value inside a document.
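The packed-array rule can be modelled in a few lines of plain PHP. This is only a sketch of the rule as described above, not the driver's implementation; the function name isPackedArray is hypothetical.

```php
<?php
// Hypothetical helper: models the packed-array rule described in the text.
// It is not part of the driver and does not call into it.
function isPackedArray(array $value): bool
{
    $expected = 0;
    foreach (array_keys($value) as $key) {
        // Keys must be exactly 0, 1, 2, ... in iteration order.
        if ($key !== $expected) {
            return false;
        }
        $expected++;
    }
    return true;
}

var_dump(isPackedArray([8, 5, 2, 3]));               // bool(true):  BSON array
var_dump(isPackedArray([0 => 4, 1 => 9]));           // bool(true):  BSON array
var_dump(isPackedArray([0 => 1, 2 => 8, 3 => 12]));  // bool(false): BSON document
var_dump(isPackedArray(['foo' => 42]));              // bool(false): BSON document
var_dump(isPackedArray([1 => 9, 0 => 10]));          // bool(false): BSON document
```

The last case shows that key order matters: the keys 0 and 1 are both present, but they are not sequential in iteration order, so the array is not packed.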
Objects

- If an object is of the stdClass class, serialize as a BSON document.

- If an object is a supported class that implements MongoDB\BSON\Type, then use the BSON serialization logic for that specific type. MongoDB\BSON\Type instances (excluding MongoDB\BSON\Serializable) may only be serialized as a document field value. Attempting to serialize such an object as a root document will throw a MongoDB\Driver\Exception\UnexpectedValueException.

- If an object is of an unknown class implementing the MongoDB\BSON\Type interface, then throw a MongoDB\Driver\Exception\UnexpectedValueException.

- If an object is of any other class, without implementing any special interface, serialize as a BSON document. Keep only public properties, and ignore protected and private properties.

- If an object is of a class that implements the MongoDB\BSON\Serializable interface, call MongoDB\BSON\Serializable::bsonSerialize and use the returned array or stdClass to serialize as a BSON document or array. The BSON type will be determined by the following:

  1. Root documents must be serialized as a BSON document.
  2. MongoDB\BSON\Persistable objects must be serialized as a BSON document.
  3. If MongoDB\BSON\Serializable::bsonSerialize returns a packed array, serialize as a BSON array.
  4. If MongoDB\BSON\Serializable::bsonSerialize returns a non-packed array or stdClass, serialize as a BSON document.
  5. If MongoDB\BSON\Serializable::bsonSerialize did not return an array or stdClass, throw a MongoDB\Driver\Exception\UnexpectedValueException exception.

- If an object is of a class that implements the MongoDB\BSON\Persistable interface (which implies MongoDB\BSON\Serializable), obtain the properties in a similar way as in the previous paragraphs, but also add an additional property __pclass as a Binary value, with subtype 0x80 and data bearing the fully qualified class name of the object that is being serialized.
The __pclass property is added to the array or object returned by MongoDB\BSON\Serializable::bsonSerialize, which means it will overwrite any __pclass key/property in the MongoDB\BSON\Serializable::bsonSerialize return value. If you want to avoid this behaviour and set your own __pclass value, you must not implement MongoDB\BSON\Persistable and should instead implement MongoDB\BSON\Serializable directly.
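Two of the rules above can be illustrated without the extension: keeping only public properties, and letting __pclass overwrite any key of the same name. The sketch below is a plain-PHP model under stated assumptions. The helpers publicProperties and withPclass and the class ExampleItem are hypothetical, and the Binary value is represented by an ordinary array rather than a driver object.

```php
<?php
// Hypothetical class used only for this illustration.
class ExampleItem
{
    public $foo = 42;
    protected $prot = "wine";
    private $fpr = "cheese";
}

// Hypothetical helper: called from outside the class scope, get_object_vars()
// returns only the public properties, which matches the rule for objects
// that implement no special interface.
function publicProperties(object $object): array
{
    return get_object_vars($object);
}

// Hypothetical helper: models how __pclass is added to the value returned by
// bsonSerialize() for a Persistable object. The Binary value (subtype 0x80)
// is represented here by a plain array, purely for illustration.
function withPclass($serialized, string $className): array
{
    $document = (array) $serialized; // accepts an array or a stdClass
    // Assignment replaces any existing __pclass key, as the text describes.
    $document['__pclass'] = ['subtype' => 0x80, 'data' => $className];
    return $document;
}

print_r(publicProperties(new ExampleItem()));   // only "foo" is listed

$document = withPclass(['foo' => 42, '__pclass' => 'mine'], 'UpperClass');
echo $document['__pclass']['data'], "\n";               // UpperClass
echo base64_encode($document['__pclass']['data']), "\n"; // VXBwZXJDbGFzcw==
```

The base64 string printed at the end is the encoding of the class name, which is how the Binary data appears in the JSON-style notation used in the examples.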
Examples

    stdClass {
        public $foo = 42;
    } => { "foo" : 42 }

    MyClass {
        public $foo = 42;
        protected $prot = "wine";
        private $fpr = "cheese";
    } => { "foo" : 42 }

    AnotherClass1 implements MongoDB\BSON\Serializable {
        public $foo = 42;
        protected $prot = "wine";
        private $fpr = "cheese";
        function bsonSerialize() {
            return [ 'foo' => $this->foo, 'prot' => $this->prot ];
        }
    } => { "foo" : 42, "prot" : "wine" }

    AnotherClass2 implements MongoDB\BSON\Serializable {
        public $foo = 42;
        function bsonSerialize() {
            return $this;
        }
    } => MongoDB\Driver\Exception\UnexpectedValueException("bsonSerialize() did not return an array or stdClass")

    AnotherClass3 implements MongoDB\BSON\Serializable {
        private $elements = [ 'foo', 'bar' ];
        function bsonSerialize() {
            return $this->elements;
        }
    } => { "0" : "foo", "1" : "bar" }

    ContainerClass implements MongoDB\BSON\Serializable {
        public $things = AnotherClass4 implements MongoDB\BSON\Serializable {
            private $elements = [ 0 => 'foo', 2 => 'bar' ];
            function bsonSerialize() {
                return $this->elements;
            }
        }
        function bsonSerialize() {
            return [ 'things' => $this->things ];
        }
    } => { "things" : { "0" : "foo", "2" : "bar" } }

    ContainerClass implements MongoDB\BSON\Serializable {
        public $things = AnotherClass5 implements MongoDB\BSON\Serializable {
            private $elements = [ 0 => 'foo', 2 => 'bar' ];
            function bsonSerialize() {
                return array_values($this->elements);
            }
        }
        function bsonSerialize() {
            return [ 'things' => $this->things ];
        }
    } => { "things" : [ "foo", "bar" ] }

    ContainerClass implements MongoDB\BSON\Serializable {
        public $things = AnotherClass6 implements MongoDB\BSON\Serializable {
            private $elements = [ 'foo', 'bar' ];
            function bsonSerialize() {
                return (object) $this->elements;
            }
        }
        function bsonSerialize() {
            return [ 'things' => $this->things ];
        }
    } => { "things" : { "0" : "foo", "1" : "bar" } }

    UpperClass implements MongoDB\BSON\Persistable {
        public $foo = 42;
        protected $prot = "wine";
        private $fpr = "cheese";
        function bsonSerialize() {
            return [ 'foo' => $this->foo, 'prot' => $this->prot ];
        }
    } => { "foo" : 42, "prot" : "wine", "__pclass" : { "$type" : "80", "$binary" : "VXBwZXJDbGFzcw==" } }
Deserialization from BSON

For compound types, there are three data types:

- root refers to the top-level BSON document only
- document refers to embedded BSON documents only
- array refers to a BSON array

Each of those three data types can be mapped against different PHP types. The possible mapping values are:

not set or NULL (this is the default)

  - A BSON array will be deserialized as a PHP array.

  - A BSON document (root or embedded) without a __pclass property becomes a PHP stdClass object, with each BSON document key set as a public stdClass property.

    Note: a __pclass property is only deemed to exist if there exists a property with that name, and it is a Binary value, and the sub-type of the Binary value is 0x80. If any of these three conditions is not met, the __pclass property does not exist and should be treated as any other normal property.

  - A BSON document (root or embedded) with a __pclass property becomes a PHP object of the class name as defined by the __pclass property.

    If the named class implements the MongoDB\BSON\Persistable interface, then the properties of the BSON document, including the __pclass property, are sent as an associative array to the MongoDB\BSON\Unserializable::bsonUnserialize function to initialise the object's properties.

    If the named class does not exist or does not implement the MongoDB\BSON\Persistable interface, stdClass will be used and each BSON document key (including __pclass) will be set as a public stdClass property.

"array"

  Turns a BSON array or BSON document into a PHP array. There will be no special treatment of a __pclass property, but it may be set as an element in the returned array if it was present in the BSON document.

"object" or "stdClass"

  Turns a BSON array or BSON document into a stdClass object. There will be no special treatment of a __pclass property, but it may be set as a public property in the returned object if it was present in the BSON document.

any other string

  Defines the class name that the BSON array or BSON object should be deserialized as. For BSON objects that include __pclass properties, that class will take priority.

  If the named class does not exist, is not concrete (i.e. it is abstract or an interface), or does not implement MongoDB\BSON\Unserializable, then a MongoDB\Driver\Exception\InvalidArgumentException exception is thrown.

  If the BSON object has a __pclass property and that class exists and implements MongoDB\BSON\Persistable, it will supersede the class provided in the type map.

  The properties of the BSON document, including the __pclass property if it exists, will be sent as an associative array to the MongoDB\BSON\Unserializable::bsonUnserialize function to initialise the object's properties.
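The precedence between a type map entry and a __pclass property can be sketched as a small decision function. This is a model of the rules above and not the driver's code. The interfaces SketchUnserializable and SketchPersistable are hypothetical stand-ins for MongoDB\BSON\Unserializable and MongoDB\BSON\Persistable so that the example runs without the extension, the function resolveTarget is hypothetical, and a built-in InvalidArgumentException stands in for the driver's exception class.

```php
<?php
// Hypothetical stand-ins for the driver's interfaces.
interface SketchUnserializable
{
    public function bsonUnserialize(array $data): void;
}
interface SketchPersistable extends SketchUnserializable
{
}

class MyClass
{
}
class YourClass implements SketchUnserializable
{
    public function bsonUnserialize(array $data): void
    {
    }
}
class OurClass implements SketchPersistable
{
    public function bsonUnserialize(array $data): void
    {
    }
}
class TheirClass extends OurClass
{
}

// Hypothetical helper. $mapped is the type map value for this data type.
// $pclass is the class name from a __pclass property that already satisfies
// the three conditions (present, Binary, subtype 0x80), or null otherwise.
function resolveTarget(?string $mapped, ?string $pclass): string
{
    if ($mapped === 'array') {
        return 'array';               // no special treatment of __pclass
    }
    if ($mapped === 'object' || $mapped === 'stdClass') {
        return 'stdClass';            // no special treatment of __pclass
    }
    if ($mapped !== null) {
        if (!class_exists($mapped) && !interface_exists($mapped)) {
            throw new InvalidArgumentException("$mapped does not exist");
        }
        if (!(new ReflectionClass($mapped))->isInstantiable()) {
            throw new InvalidArgumentException("$mapped is not a concrete class");
        }
        if (!is_subclass_of($mapped, SketchUnserializable::class)) {
            throw new InvalidArgumentException("$mapped does not implement Unserializable interface");
        }
    }
    // A usable __pclass supersedes both the default and a mapped class.
    if ($pclass !== null && class_exists($pclass) && is_subclass_of($pclass, SketchPersistable::class)) {
        return $pclass;
    }
    return $mapped ?? 'stdClass';
}

echo resolveTarget(null, 'OurClass'), "\n";         // OurClass
echo resolveTarget(null, 'YourClass'), "\n";        // stdClass
echo resolveTarget('YourClass', 'MyClass'), "\n";   // YourClass
echo resolveTarget('YourClass', 'TheirClass'), "\n"; // TheirClass
```

The order in which the three validity checks run is an assumption of this sketch; the text only states that any one of them failing results in an exception.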
TypeMaps

TypeMaps can be set through the MongoDB\Driver\Cursor::setTypeMap method on a MongoDB\Driver\Cursor object, or the $typeMap argument of MongoDB\BSON\toPHP. Each of the three data types (root, document and array) can be set individually. If the value in the map is NULL, it means the same as the default value for that item.
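As a minimal sketch of passing a type map, the snippet below round-trips a PHP array through BSON and decodes it with root and document both mapped to "array". It assumes the mongodb extension is loaded and that it provides the MongoDB\BSON\fromPHP and MongoDB\BSON\toPHP functions; not every release of the extension may do so, which is why the calls are guarded. The sample data is made up for illustration.

```php
<?php
// Sample value and type map. The "array" key is left unset, so BSON arrays
// use the default mapping (PHP array).
$value   = ['foo' => 'no', 'obj' => ['embedded' => 3.14]];
$typeMap = ['root' => 'array', 'document' => 'array'];

$decoded = null;
if (function_exists('MongoDB\BSON\fromPHP') && function_exists('MongoDB\BSON\toPHP')) {
    // Encode to a BSON string, then decode it using the type map.
    $bson    = \MongoDB\BSON\fromPHP($value);
    $decoded = \MongoDB\BSON\toPHP($bson, $typeMap);
    var_dump($decoded);
} else {
    // Assumption not met: the extension or these functions are unavailable.
    echo "MongoDB BSON functions are not available; skipping.\n";
}
```

With this type map, both the root document and the embedded document come back as PHP arrays rather than stdClass objects.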
Examples

These examples use the following classes:

- MyClass, which does not implement any interface
- YourClass, which implements MongoDB\BSON\Unserializable
- OurClass, which implements MongoDB\BSON\Persistable
- TheirClass, which extends OurClass

The MongoDB\BSON\Unserializable::bsonUnserialize method of YourClass, OurClass and TheirClass iterates over the array and sets the properties without modifications. It also sets the $unserialized property to true:

    function bsonUnserialize( array $map )
    {
        foreach ( $map as $k => $value )
        {
            $this->$k = $value;
        }
        $this->unserialized = true;
    }

    /* typemap: [] (all defaults) */
    { "foo": "yes", "bar" : false }
      -> stdClass { $foo => 'yes', $bar => false }

    { "foo": "no", "array" : [ 5, 6 ] }
      -> stdClass { $foo => 'no', $array => [ 5, 6 ] }

    { "foo": "no", "obj" : { "embedded" : 3.14 } }
      -> stdClass { $foo => 'no', $obj => stdClass { $embedded => 3.14 } }

    { "foo": "yes", "__pclass": "MyClass" }
      -> stdClass { $foo => 'yes', $__pclass => 'MyClass' }

    { "foo": "yes", "__pclass": { "$type" : "80", "$binary" : "MyClass" } }
      -> stdClass { $foo => 'yes', $__pclass => Binary(0x80, 'MyClass') }

    { "foo": "yes", "__pclass": { "$type" : "80", "$binary" : "YourClass" } }
      -> stdClass { $foo => 'yes', $__pclass => Binary(0x80, 'YourClass') }

    { "foo": "yes", "__pclass": { "$type" : "80", "$binary" : "OurClass" } }
      -> OurClass { $foo => 'yes', $__pclass => Binary(0x80, 'OurClass'), $unserialized => true }

    { "foo": "yes", "__pclass": { "$type" : "44", "$binary" : "YourClass" } }
      -> stdClass { $foo => 'yes', $__pclass => Binary(0x44, 'YourClass') }

    /* typemap: [ "root" => "MissingClass" ] */
    { "foo": "yes" }
      -> MongoDB\Driver\Exception\InvalidArgumentException("MissingClass does not exist")

    /* typemap: [ "root" => "MyClass" ] */
    { "foo": "yes", "__pclass" : { "$type": "80", "$binary": "MyClass" } }
      -> MongoDB\Driver\Exception\InvalidArgumentException("MyClass does not implement Unserializable interface")

    /* typemap: [ "root" => "MongoDB\BSON\Unserializable" ] */
    { "foo": "yes" }
      -> MongoDB\Driver\Exception\InvalidArgumentException("Unserializable is not a concrete class")

    /* typemap: [ "root" => "YourClass" ] */
    { "foo": "yes", "__pclass" : { "$type": "80", "$binary": "MongoDB\BSON\Unserializable" } }
      -> YourClass { $foo => "yes", $__pclass => Binary(0x80, "MongoDB\BSON\Unserializable"), $unserialized => true }

    /* typemap: [ "root" => "YourClass" ] */
    { "foo": "yes", "__pclass" : { "$type": "80", "$binary": "MyClass" } }
      -> YourClass { $foo => "yes", $__pclass => Binary(0x80, "MyClass"), $unserialized => true }

    /* typemap: [ "root" => "YourClass" ] */
    { "foo": "yes", "__pclass" : { "$type": "80", "$binary": "OurClass" } }
      -> OurClass { $foo => "yes", $__pclass => Binary(0x80, "OurClass"), $unserialized => true }

    /* typemap: [ "root" => "YourClass" ] */
    { "foo": "yes", "__pclass" : { "$type": "80", "$binary": "TheirClass" } }
      -> TheirClass { $foo => "yes", $__pclass => Binary(0x80, "TheirClass"), $unserialized => true }

    /* typemap: [ "root" => "OurClass" ] */
    { "foo": "yes", "__pclass" : { "$type": "80", "$binary": "TheirClass" } }
      -> TheirClass { $foo => "yes", $__pclass => Binary(0x80, "TheirClass"), $unserialized => true }

    /* typemap: [ 'root' => 'YourClass' ] */
    { "foo": "yes", "__pclass" : { "$type": "80", "$binary": "YourClass" } }
      -> YourClass { $foo => 'yes', $__pclass => Binary(0x80, 'YourClass'), $unserialized => true }

    /* typemap: [ 'root' => 'array', 'document' => 'array' ] */
    { "foo": "yes", "bar" : false }
      -> [ "foo" => "yes", "bar" => false ]

    { "foo": "no", "array" : [ 5, 6 ] }
      -> [ "foo" => "no", "array" => [ 5, 6 ] ]

    { "foo": "no", "obj" : { "embedded" : 3.14 } }
      -> [ "foo" => "no", "obj" => [ "embedded" => 3.14 ] ]

    { "foo": "yes", "__pclass": "MyClass" }
      -> [ "foo" => "yes", "__pclass" => "MyClass" ]

    { "foo": "yes", "__pclass" : { "$type": "80", "$binary": "MyClass" } }
      -> [ "foo" => "yes", "__pclass" => Binary(0x80, "MyClass") ]

    { "foo": "yes", "__pclass" : { "$type": "80", "$binary": "OurClass" } }
      -> [ "foo" => "yes", "__pclass" => Binary(0x80, "OurClass") ]

    /* typemap: [ 'root' => 'object', 'document' => 'object' ] */
    { "foo": "yes", "__pclass": { "$type": "80", "$binary": "MyClass" } }
      -> stdClass { $foo => "yes", $__pclass => Binary(0x80, "MyClass") }