# PODNAME: MongoDB::DataTypes
# ABSTRACT: The data types used with MongoDB

__END__

=pod

=head1 NAME

MongoDB::DataTypes - The data types used with MongoDB

=head1 VERSION

version 0.503.2

=head1 DESCRIPTION

This goes over the types you can save to the database and use for queries in
the Perl driver. If you are using another language, please refer to that
language's documentation (L).

=head1 NOTES FOR SQL PROGRAMMERS

=head2 You must query for data using the correct type.

For example, it is perfectly valid to have some records where the field "foo"
is 123 (integer) and other records where "foo" is "123" (string). Thus, you
must query for the correct type. If you save C<{"foo" =E<gt> "123"}>, you
cannot query for it with C<{"foo" =E<gt> 123}>. MongoDB is strict about types.

If the type of a field is ambiguous and important to your application, you
should document what you expect the application to send to the database and
convert your data to those types before sending. There are some
object-document mappers that will enforce certain types for certain fields for
you.

You generally shouldn't save numbers as strings, as they will behave like
strings (e.g., range queries won't work correctly) and the data will take up
more space. If you set C<$MongoDB::BSON::looks_like_number>, the driver will
automatically convert everything that looks like a number to a number before
sending it to the database.

Numbers are the only exception to the strict typing: all number types stored
by MongoDB (32-bit integers, 64-bit integers, 64-bit floating point numbers)
will match each other.

=head1 TYPES

=head2 Numbers

By default, numbers with a decimal point will be saved as doubles (64-bit).

=head3 32-bit Platforms

Numbers without decimal points will be saved as 32-bit integers. To save a
number as a 64-bit integer, use bigint:

    use bigint;

    $collection->insert({"user_id" => 28347197234178});

The driver will die if you try to insert a number beyond the signed 64-bit
range: -9,223,372,036,854,775,808 to +9,223,372,036,854,775,807.

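As a rough illustration of the range limit above, a value can be checked
before it is handed to the driver. This is only a sketch: C<fits_in_int64> is
a hypothetical helper, not part of the driver, and it relies solely on the
core L<Math::BigInt> module.

    use strict;
    use warnings;
    use Math::BigInt;

    # Bounds of the signed 64-bit range quoted above.
    my $INT64_MIN = Math::BigInt->new("-9223372036854775808");
    my $INT64_MAX = Math::BigInt->new("9223372036854775807");

    # Hypothetical helper: true if $value is an integer inside the range.
    sub fits_in_int64 {
        my ($value) = @_;
        my $n = Math::BigInt->new("$value");
        return 0 if $n->is_nan;    # not an integer at all
        return ($n >= $INT64_MIN && $n <= $INT64_MAX) ? 1 : 0;
    }

    print fits_in_int64("28347197234178")      ? "ok\n" : "too big\n";
    print fits_in_int64("9223372036854775808") ? "ok\n" : "too big\n";

Values are passed as strings here so that the check does not depend on how the
local perl represents large numbers.
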
Numbers that are saved as 64-bit integers will be decoded as doubles.

=head3 64-bit Platforms

Numbers without a decimal point will be saved and returned as 64-bit integers.
Note that there is no way to save a 32-bit int on a 64-bit machine.

Keep in mind that this can cause some weirdness to ensue if some machines are
32-bit and others are 64-bit. Take the following example:

=over 4

=item *

Programmer 1 saves an int on a 32-bit platform.

=item *

Programmer 2 retrieves the document on a 64-bit platform and re-saves it,
effectively converting it to a 64-bit int.

=item *

Programmer 1 retrieves the document on their 32-bit machine, which decodes the
64-bit int as a double.

=back

Nothing drastic, but good to be aware of.

=head4 64-bit integers in the shell

The Mongo shell has one numeric type: the 8-byte float. This means that it
cannot always represent an 8-byte integer exactly. Thus, when you display a
64-bit integer in the shell, it will be wrapped in a subobject that indicates
it might be an approximate value. For instance, if we run this Perl on a
64-bit machine:

    $coll->insert({_id => 1});

then look at it in the shell, we see:

    > db.whatever.findOne()
    {
        "_id" : {
            "floatApprox" : 1
        }
    }

This doesn't mean that we saved a float; it just means that the float value of
a 64-bit integer may not be exact.

=head4 Dealing with numbers and strings in Perl

Perl is very flexible about whether something is a number or a string: it
generally infers the type from context. Unfortunately, the driver doesn't have
any context when it has to choose how to serialize a variable. Therefore, the
default behavior is to introspect the flags that are set on that variable and
decide what the user meant. Those flags are generally affected by the last
operation performed on the variable.

    my $var = "4";

    # stored as the string "4"
    $collection->insert({myVar => $var});

    $var = int($var) if (int($var) eq $var);

    # stored as the int 4
    $collection->insert({myVar => $var});

Because of this, users often find that they end up with more strings than they
wanted in their database.

If you would like to have everything that looks like a number saved as a
number, set the C<$MongoDB::BSON::looks_like_number> option.

    $MongoDB::BSON::looks_like_number = 1;

    my $var = "4";

    # stored as the int 4
    $collection->insert({myVar => $var});

This will send anything that "looks like" a number as a number. It can
recognize anything that L's C function can recognize.

On the other hand, sometimes there is data that looks like a number but should
be saved as a string. For example, suppose we were storing zip codes. If we
wanted to generally convert strings to numbers, we might have something like:

    $MongoDB::BSON::looks_like_number = 1;

    # zip is stored as an int: 4101
    $collection->insert({city => "Portland", "zip" => "04101"});

To force a "number" to be saved as a string with aggressive number conversion
on, bless a reference to the string as a C<MongoDB::BSON::String> type:

    my $z = "04101";
    my $zip = bless(\$z, "MongoDB::BSON::String");

    # zip is stored as "04101"
    $collection->insert({city => "Portland", zip => $zip});

=head2 Strings

All strings must be valid UTF-8 to be sent to the database. If a string is not
valid, it will not be saved. If you need to save a non-UTF-8 string, you can
save it as a binary blob (see the Binary Data section below).

All strings returned from the database have the UTF-8 flag set.

Unfortunately, due to Perl weirdness, UTF-8 is not very pretty. For example,
suppose we have a UTF-8 string:

    my $str = 'Åland Islands';

Now, let's print it:

    print "$str\n";

You can see in the output:

    "\x{c5}land Islands"

Lovely, isn't it? This is how Perl prints UTF-8. To make it "pretty," there
are a couple of options:

    # utf8::encode works in place and returns nothing, so encode a copy
    my $pretty_str = $str;
    utf8::encode($pretty_str);

This, unintuitively, clears the UTF-8 flag.

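The effect on the flag can be checked with C<utf8::is_utf8>. The following is
a minimal sketch that uses only the built-in C<utf8::> functions and does not
touch the driver; the variable names are illustrative. Note that
C<utf8::encode> and C<utf8::decode> both modify their argument in place.

    use strict;
    use warnings;

    # Build a character string from UTF-8 bytes; utf8::decode works in place.
    my $str = "\xc3\x85land Islands";
    utf8::decode($str);

    # Copy first, because utf8::encode also modifies its argument in place
    # and does not return the encoded string.
    my $bytes = $str;
    utf8::encode($bytes);

    printf "flag on characters: %d\n", utf8::is_utf8($str)   ? 1 : 0;
    printf "flag on bytes:      %d\n", utf8::is_utf8($bytes) ? 1 : 0;
    printf "lengths: %d characters, %d bytes\n",
        length($str), length($bytes);
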
You can also just run

    binmode STDOUT, ':utf8';

and then the string (and all future UTF-8 strings) will print "correctly."

You can also turn off C<$MongoDB::BSON::utf8_flag_on>, and the UTF-8 flag will
not be set when strings are decoded:

    $MongoDB::BSON::utf8_flag_on = 0;

=head2 Arrays

Arrays must be saved as array references (C<\@foo>, not C<@foo>).

=head2 Embedded Documents

Embedded documents are of the same form as top-level documents: either hash
references or Ls.

=head2 Dates

The L<DateTime> package can be used to insert and query for dates. Dates
stored in the database will be returned as instances of DateTime.

An example of storing and retrieving a date:

    use DateTime;

    my $now = DateTime->now;
    $collection->insert({'ts' => $now});

    my $obj = $collection->find_one;
    print "Today is ".$obj->{'ts'}->ymd."\n";

An example of querying for a range of dates:

    my $start = DateTime->from_epoch( epoch => 100000 );
    my $end = DateTime->from_epoch( epoch => 500000 );

    my $cursor = $collection->query({event => {'$gt' => $start, '$lt' => $end}});

B<Creating DateTime objects is extremely slow. Consider saving dates as
numbers and converting the numbers to DateTime objects when needed. A single
DateTime field can make deserialization up to 10 times slower.>

For example, you could use the L