<?xml version="1.0" encoding="utf-8"?>
<!-- $Revision$ -->

<part xml:id="mongo.types" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">

 <title>Types</title>

 <partintro>
  <section>
   <para>
    MongoDB allows programmers to save and query for data expressed in all of the
    basic PHP types, compound types (arrays, associative arrays, and objects), and
    several classes provided by the MongoDB PHP driver (for regular
    expressions, dates, and other specialized applications).
   </para>
  </section>

  <section>
   <title>Booleans and &null;</title>
   <para>
    &true;, &false;, and &null; can be used as-is.
   </para>
  </section>

  <section>
   <title>Numbers</title>
   <para>
    Numbers are distinct from strings in MongoDB: "123" does not match 123.
    Thus, if you want to make sure numbers are sorted and matched correctly, you
    must make sure that they are actually saved as numbers.
   </para>
   <programlisting role="php">
<![CDATA[
<?php

$doc = array("a" => 123, "b" => "123");
$collection->insert($doc);

$collection->find(array("a" => 123));   // matches
$collection->find(array("a" => "123")); // doesn't match
$collection->find(array("a" => 123.0)); // matches
$collection->find(array("b" => 123));   // doesn't match
$collection->find(array("b" => "123")); // matches

?>
]]>
   </programlisting>
   <para>
    As the example shows, floating point numbers compare with and match
    integers as one would expect: 123.0 matches 123.
   </para>
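   <para>
    When values arrive as strings (from form fields or query strings, for
    example), casting them before saving is one way to keep them numeric.  The
    following is a minimal sketch; the <literal>$input</literal> array and its
    keys are hypothetical stand-ins for real request data.
   </para>
   <programlisting role="php">
<![CDATA[
<?php

// Hypothetical request data: every value arrives as a string.
$input = array("age" => "42", "score" => "97.5");

// Cast explicitly so the values are stored as numbers, not strings.
$doc = array(
    "age"   => (int) $input["age"],
    "score" => (float) $input["score"],
);

var_dump(is_int($doc["age"]));     // bool(true)
var_dump(is_float($doc["score"])); // bool(true)

?>
]]>
   </programlisting>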
   <section>
    <title>Large Numbers</title>
    <para>
     By default, on a 32-bit system, numbers are sent to the database as 32-bit
     integers. On a 64-bit system, they are sent as 64-bit integers.  For
     backwards compatibility, all systems deserialize 64-bit integers as floating
     point numbers.  Floating point numbers are not exact, so large values can
     lose precision.  If you need exact values, you must change your
     <link linkend="mongo.configuration">php.ini settings</link>.
    </para>
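    <para>
     The loss of precision can be illustrated without a database.  This sketch
     assumes a 64-bit build of PHP; the variable names are illustrative only.
     It converts an integer just above 2^53 to a floating point number and
     back, which is comparable to what happens when a 64-bit integer is
     deserialized as a float.
    </para>
    <programlisting role="php">
<![CDATA[
<?php

// Assumes a 64-bit build of PHP (PHP_INT_SIZE === 8).
$exact = 9007199254740993;      // 2^53 + 1, fits in a 64-bit integer
$asFloat = (float) $exact;      // the nearest floating point value is 2^53
$roundTripped = (int) $asFloat;

var_dump($roundTripped === $exact); // bool(false)
var_dump($exact - $roundTripped);   // int(1)

?>
]]>
    </programlisting>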
    <para>
     On a 32-bit system, if <literal>mongo.long_as_object</literal> is set,
     64-bit integers will be returned as <classname>MongoInt64</classname>
     objects.  The integer will be stored in the <literal>value</literal> field
     with perfect precision (as a string). You can also use
     <classname>MongoInt64</classname> to save 64-bit integers on 32-bit
     machines.
    </para>
    <para>
     On 64-bit systems, you can either set <literal>mongo.long_as_object</literal>
     or set <literal>mongo.native_long</literal>.
     <literal>mongo.native_long</literal> will return 64-bit integers as
     "normal" PHP integers.  You can use <classname>MongoInt32</classname> to
     save 32-bit integers on 64-bit machines.
    </para>
    <para>
     You should explicitly set the <literal>mongo.long_as_object</literal> and
     <literal>mongo.native_long</literal> values that you plan to use, even if
     they match the default behavior (to protect against future changes to the
     defaults).
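    <para>
     One possible way to pin both settings is sketched below.  The helper
     <literal>chooseLongSettings()</literal> is hypothetical, and the sketch
     assumes that the settings may be changed at runtime with
     <function>ini_set</function>; setting the same values in php.ini is an
     alternative.
    </para>
    <programlisting role="php">
<![CDATA[
<?php

// Hypothetical helper: picks explicit values for both settings based on the
// platform's integer size in bytes (4 on 32-bit builds, 8 on 64-bit builds).
function chooseLongSettings($intSize)
{
    if ($intSize >= 8) {
        // 64-bit: return 64-bit integers as native PHP integers.
        return array("mongo.native_long" => "1", "mongo.long_as_object" => "0");
    }

    // 32-bit: return 64-bit integers as MongoInt64 objects.
    return array("mongo.native_long" => "0", "mongo.long_as_object" => "1");
}

// ini_set() returns false for a setting that is not registered, for example
// when the driver is not loaded.
foreach (chooseLongSettings(PHP_INT_SIZE) as $name => $value) {
    ini_set($name, $value);
}

?>
]]>
    </programlisting>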
    <para>
     See also: <link linkend="mongo.configuration">php.ini Options</link>,
     <classname>MongoInt32</classname>, <classname>MongoInt64</classname>.
    </para>
   </section>
  </section>

  <section>
   <title>Strings</title>
   <para>
    Strings must be UTF-8. Non-UTF-8 strings must either be converted to UTF-8
    before being sent to the database or saved as binary data.
   </para>
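   <para>
    A string can be checked and converted before it is sent to the database.
    The following is a minimal sketch: <literal>isUtf8()</literal> and
    <literal>toUtf8()</literal> are hypothetical helpers, and the sketch
    assumes that any non-UTF-8 input is encoded as ISO-8859-1 and that the
    iconv extension is available.
   </para>
   <programlisting role="php">
<![CDATA[
<?php

// Returns true if $s is valid UTF-8.  The "u" modifier makes PCRE validate
// the subject string; preg_match() returns false for invalid input.
function isUtf8($s)
{
    return preg_match('//u', $s) === 1;
}

// Assumption for this sketch: non-UTF-8 input is ISO-8859-1 (Latin-1).
function toUtf8($s, $fromEncoding = "ISO-8859-1")
{
    return isUtf8($s) ? $s : iconv($fromEncoding, "UTF-8", $s);
}

$latin1 = "caf\xE9"; // "cafe" with an accented e, encoded as ISO-8859-1

var_dump(isUtf8($latin1));         // bool(false)
var_dump(isUtf8(toUtf8($latin1))); // bool(true)

?>
]]>
   </programlisting>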
   <para>
    Regular expressions can be used to match strings, and are expressed using the
    <classname>MongoRegex</classname> class.
   </para>
  </section>

  <section>
   <title>Binary Data</title>
   <para>
    Non-UTF-8 strings, images, and any other binary data should be sent to the
    database using the <classname>MongoBinData</classname> type.
   </para>
  </section>

  <section>
   <title>Dates</title>
   <para>
    Dates can be created using the <classname>MongoDate</classname> class.  They
    are stored as milliseconds since the epoch.
   </para>
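   <para>
    Because the stored value is in milliseconds, any precision finer than a
    millisecond would not be expected to survive a round trip.  The sketch
    below shows the arithmetic only; <literal>toMilliseconds()</literal> is a
    hypothetical helper, and the sketch assumes a 64-bit build of PHP so that
    the result fits in an integer.
   </para>
   <programlisting role="php">
<![CDATA[
<?php

// Hypothetical helper: combines seconds and microseconds into milliseconds
// since the epoch.  Anything below one millisecond is dropped.
function toMilliseconds($sec, $usec)
{
    return $sec * 1000 + (int) floor($usec / 1000);
}

$ms = toMilliseconds(1258734179, 123456);

var_dump($ms); // int(1258734179123)

?>
]]>
   </programlisting>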
   <para>
    <classname>MongoTimestamp</classname> is not for saving dates or timestamps;
    it is used internally by MongoDB.  Unless you are creating a tool that
    interacts with the internals of replication or sharding, you should use
    <classname>MongoDate</classname>, <emphasis>not</emphasis>
    <classname>MongoTimestamp</classname>.
   </para>
  </section>

  <section>
   <title>Unique Ids</title>
   <para>
    The driver will automatically create an <literal>_id</literal> field before
    inserting a document (unless one is specified by the user).  This field is an
    instance of <classname>MongoId</classname> (called "ObjectId" in most other
    languages).
   </para>
   <para>
    These ids are 12 bytes long and composed of:
    <itemizedlist>
     <listitem>
      <para>4 bytes of timestamp</para>
      <para>
       No two records can have the same id if they were inserted at different
       times.
      </para>
     </listitem>
     <listitem>
      <para>3 bytes of machine id</para>
      <para>
       No two records can have the same id if they were inserted on different
       machines.
      </para>
     </listitem>
     <listitem>
      <para>2 bytes of thread id</para>
      <para>
       No two records can have the same id if they were inserted by different
       threads running on the same machine.
      </para>
     </listitem>
     <listitem>
      <para>3 bytes of incrementing value</para>
      <para>
       Each time an id is created, a global counter is incremented and used
       as the increment value of the next id.
      </para>
     </listitem>
    </itemizedlist>
    Thus, no two records can have the same id unless a single process on a
    single machine manages to insert 256^3 (over 16 million) documents in
    one second, overflowing the increment field.
   </para>
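   <para>
    The layout described above can be illustrated by splitting the
    24-character hexadecimal form of an id into its four parts.  This is a
    sketch only: <literal>splitId()</literal> is a hypothetical helper, not
    part of the driver, and it assumes the parts appear in the order listed.
   </para>
   <programlisting role="php">
<![CDATA[
<?php

// Hypothetical helper: splits the hexadecimal form of an id (two characters
// per byte) into the four parts listed above.
function splitId($hex)
{
    return array(
        "timestamp" => hexdec(substr($hex, 0, 8)),  // 4 bytes
        "machine"   => substr($hex, 8, 6),          // 3 bytes
        "thread"    => hexdec(substr($hex, 14, 4)), // 2 bytes
        "increment" => hexdec(substr($hex, 18, 6)), // 3 bytes
    );
}

$parts = splitId("4b06beada9ad6390dab17c43");

echo gmdate("Y-m-d H:i:s", $parts["timestamp"]), "\n"; // 2009-11-20 16:07:09
echo $parts["increment"], "\n";                        // 11631683

?>
]]>
   </programlisting>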
  </section>

  <section>
   <title>JavaScript</title>
   <para>
    MongoDB comes with a JavaScript engine, so you can embed JavaScript in
    queries (using a <literal>$where</literal> clause), send it directly to the
    database to be executed, and use it to perform aggregations.
   </para>
   <para>
    For security, use <classname>MongoCode</classname>'s <literal>scope</literal>
    field to use PHP variables in JavaScript.  Code that does not require
    external values can either use <classname>MongoCode</classname> or just be
    a string.  See the
    <link linkend="mongo.security">section on security</link> for more
    information about sending JavaScript to the database.
   </para>
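   <para>
    The difference between concatenating a PHP value into JavaScript and
    passing it through the scope can be sketched as follows.  The
    <literal>$minAge</literal> value and the <literal>age</literal> field are
    hypothetical, and the <classname>MongoCode</classname> object is only
    created when the driver is loaded.
   </para>
   <programlisting role="php">
<![CDATA[
<?php

// Hypothetical user input that tries to smuggle in extra JavaScript.
$minAge = "0; while (true) {}";

// Unsafe: concatenation makes the input part of the code itself.
$unsafe = "function() { return this.age > " . $minAge . "; }";

// Safer: the code stays constant and the value travels in the scope.
$js    = "function() { return this.age > minAge; }";
$scope = array("minAge" => $minAge);

// MongoCode is only available when the driver is loaded.
if (class_exists("MongoCode")) {
    $code = new MongoCode($js, $scope);
}

?>
]]>
   </programlisting>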
  </section>

  <section>
   <title>Arrays and Objects</title>

   <para>
    Arrays and objects can also be saved to the database.  An array with ascending
    numeric keys will be saved as an array; anything else will be saved as an
    object.
   </para>

   <programlisting role="php">
<![CDATA[
<?php

// $scores will be saved as an array
$scores = array(98, 100, 73, 85);
$collection->insert(array("scores" => $scores));

// $scores will be saved as an object
$scores = array("quiz1" => 98, "midterm" => 100, "quiz2" => 73, "final" => 85);
$collection->insert(array("scores" => $scores));

?>
]]>
   </programlisting>

   <para>
    If you query for these objects using the database shell, they will look like:
    <programlisting role="shell">
<![CDATA[
> db.students.find()
{ "_id" : ObjectId("4b06beada9ad6390dab17c43"), "scores" : [ 98, 100, 73, 85 ] }
{ "_id" : ObjectId("4b06bebea9ad6390dab17c44"), "scores" : { "quiz1" : 98, "midterm" : 100, "quiz2" : 73, "final" : 85 } }
]]>
    </programlisting>
   </para>
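   <para>
    To predict how a given PHP array would be saved, a check along these lines
    may help.  It is a sketch under an assumption: that "ascending numeric
    keys" means the keys 0, 1, 2, and so on, with no gaps.
    <literal>looksLikeList()</literal> is a hypothetical helper, not part of
    the driver.
   </para>
   <programlisting role="php">
<![CDATA[
<?php

// Hypothetical helper: returns true if the keys are exactly 0, 1, 2, ...
// An empty array is treated as a list.
function looksLikeList(array $a)
{
    if (count($a) === 0) {
        return true;
    }

    return array_keys($a) === range(0, count($a) - 1);
}

var_dump(looksLikeList(array(98, 100, 73, 85)));              // bool(true)
var_dump(looksLikeList(array("quiz1" => 98, "final" => 85))); // bool(false)
var_dump(looksLikeList(array(1 => "a", 2 => "b")));           // bool(false)

?>
]]>
   </programlisting>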

   <para>
    The database can also save arbitrary PHP objects (although they will be
    returned as associative arrays).  The fields are used for the key/value
    pairs.  For example, a blog post might look like:
    <programlisting role="php">
<![CDATA[
<?php

// the blog post class
class Post {

  public $author;
  public $content;
  public $comments = array();
  public $date;

  public function __construct($author, $content) {
    $this->author = $author;
    $this->content = $content;
    $this->date = new MongoDate();
  }

  public function setTitle($title) {
    $this->title = $title;
  }
}

// create a simple blog post and insert it into the database
$post1 = new Post("Adam", "This is a blog post");

$blog->insert($post1);


// there is nothing restricting the type of the "author" field, so we can make
// it a nested object
$author = array("name" => "Fred", "karma" => 42);
$post2 = new Post($author, "This is another blog post.");

// we create an extra field by setting the title
$post2->setTitle("Second Post");

$blog->insert($post2);

?>
]]>
    </programlisting>
   </para>

   <para>
    From the database shell, this will look something like:
    <programlisting role="shell">
<![CDATA[
> db.blog.find()
{ "_id" : ObjectId("4b06c263edb87a281e09dad8"), "author" : "Adam", "content" : "This is a blog post", "comments" : [ ], "date" : "Fri Nov 20 2009 11:22:59 GMT-0500 (EST)" }
{ "_id" : ObjectId("4b06c282edb87a281e09dad9"), "author" : { "name" : "Fred", "karma" : 42 }, "content" : "This is another blog post.", "comments" : [ ], "date" : "Fri Nov 20 2009 11:23:30 GMT-0500 (EST)", "title" : "Second Post" }
]]>
    </programlisting>
   </para>
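   <para>
    Because saved objects come back as associative arrays, code that needs an
    object again has to rebuild it.  The following is one possible sketch; the
    <literal>Comment</literal> class, the <literal>hydrate()</literal> helper,
    and the sample document are all hypothetical.
   </para>
   <programlisting role="php">
<![CDATA[
<?php

// Hypothetical class used only for this sketch.
class Comment
{
    public $author;
    public $text;
}

// Hypothetical helper: copies the keys that match declared properties from a
// returned document onto a new object.  Other keys are ignored.
function hydrate($className, array $doc)
{
    $object = new $className();
    foreach ($doc as $key => $value) {
        if (property_exists($object, $key)) {
            $object->$key = $value;
        }
    }

    return $object;
}

// Stand-in for a document returned by a query.
$doc = array("_id" => "abc123", "author" => "Fred", "text" => "Nice post");
$comment = hydrate("Comment", $doc);

var_dump($comment instanceof Comment); // bool(true)
var_dump($comment->author);            // string(4) "Fred"

?>
]]>
   </programlisting>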

   <para>
    The driver will not detect reference loops in arrays and objects.  For
    example, this will give a fatal error:
    <programlisting role="php">
<![CDATA[
<?php

$collection->insert($GLOBALS);

?>
]]>
    </programlisting>
    <programlisting role="txt">
<![CDATA[
Fatal error: Nesting level too deep - recursive dependency?
]]>
    </programlisting>
    If you need to insert documents that may contain reference loops, you must
    check for them yourself before passing the document to the driver.
   </para>
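   <para>
    One simple way to perform such a check is to limit the nesting depth, since
    a reference loop nests without end.  This is a sketch, not the only
    approach: <literal>nestsDeeperThan()</literal> is a hypothetical helper,
    and the limit of 100 is an arbitrary value that assumes legitimate
    documents are never nested that deeply.
   </para>
   <programlisting role="php">
<![CDATA[
<?php

// Hypothetical helper: returns true if $value contains arrays or objects
// nested more than $maxDepth levels deep.  Only the properties visible from
// this scope (public ones) are inspected on objects.
function nestsDeeperThan($value, $maxDepth, $depth = 0)
{
    if ($depth > $maxDepth) {
        return true;
    }
    if (is_object($value)) {
        $value = get_object_vars($value);
    }
    if (!is_array($value)) {
        return false;
    }
    foreach ($value as $child) {
        if (nestsDeeperThan($child, $maxDepth, $depth + 1)) {
            return true;
        }
    }

    return false;
}

$loop = array("name" => "loop");
$loop["self"] = &$loop; // the array now contains a reference to itself

var_dump(nestsDeeperThan($loop, 100));                         // bool(true)
var_dump(nestsDeeperThan(array("a" => array("b" => 1)), 100)); // bool(false)

?>
]]>
   </programlisting>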
  </section>
 </partintro>
 &reference.mongo.mongoid;
 &reference.mongo.mongocode;
 &reference.mongo.mongodate;
 &reference.mongo.mongoregex;
 &reference.mongo.mongobindata;
 &reference.mongo.mongoint32;
 &reference.mongo.mongoint64;
 &reference.mongo.mongodbref;
 &reference.mongo.mongominkey;
 &reference.mongo.mongomaxkey;
 &reference.mongo.mongotimestamp;

</part>