Persisting and fetching DateTime values with MongoDB

28 03 2012

By default, the C# driver for MongoDB serializes C# objects to a BSON representation, which is what gets stored inside MongoDB. The DateTime type needs some special attention, which I want to demonstrate with the following test.


// Base class for repository specs: creates the repository under test for each spec class.
public abstract class RepositorySubject<T> where T : Repository
{
    public static T Subject { get; set; }

    public RepositorySubject()
    {
        var mongoDb = new MongoDB();
        Subject = (T)Activator.CreateInstance(typeof(T), mongoDb);
    }
}


public static class BlogRepositorySpecs
{
    [Subject(typeof(BlogRepository))]
    public class When_refetching_persisted_data : RepositorySubject<BlogRepository>
    {
        private static BlogPost blogEntry;

        private static BlogPost result;

        Establish context = () =>
        {
            blogEntry = new BlogPost()
            {
                Author = "Test Author",
                Comment = "My Comment",
                CreationDate = new DateTime(2012, 4, 12),
                Id = ObjectId.GenerateNewId()
            };
        };

        Because of = () =>
        {
            Subject.Save(blogEntry);
            result = Subject.FindById(blogEntry.Id.ToString());
        };

        Cleanup after = () => Subject.Drop();

        It should_have_the_correct_creationdate = () => {
            result.CreationDate.ShouldEqual(blogEntry.CreationDate);
        };

        It should_have_the_correct_author = () => {
            result.Author.ShouldEqual(blogEntry.Author);
        };

        It should_have_the_correct_comment = () => {
            result.Comment.ShouldEqual(blogEntry.Comment);
        };
    }
}

This simple test is written with MSpec. The test creates a BlogPost object with a DateTime value and two string values and persists it inside MongoDB. Afterwards the object is retrieved again and the retrieved data is compared to the expected data.
The abstract RepositorySubject is a helper which can be reused for other MSpec tests that cover different repository functionality.
The Repository class itself is an abstract class holding the most important CRUD operations for working with MongoDB. BlogRepository inherits from Repository, and additional operations can be added there. The MongoDB class establishes the connection to the MongoDB database and executes the commands.

Running the Test

When we run the test on a machine whose time zone is not UTC, the test fails. Why? DateTime values are stored as UTC inside the database, and when we retrieve the data the DateTime value comes back as a UTC DateTime. The test therefore compares a local DateTime with a UTC DateTime. Because DateTime equality compares the underlying tick values and ignores the Kind, the two values differ by the machine's UTC offset.
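The mismatch can be reproduced without a database. The following is only a minimal sketch: the class DateTimeRoundTripDemo, its methods, and the fixed UTC+2 zone are hypothetical names I made up for illustration, and the conversion merely imitates the round trip described above rather than calling the driver.

```csharp
using System;

public static class DateTimeRoundTripDemo
{
    // Hypothetical fixed zone (UTC+2) so the result does not depend on the machine settings.
    private static readonly TimeZoneInfo DemoZone =
        TimeZoneInfo.CreateCustomTimeZone("Demo", TimeSpan.FromHours(2), "Demo", "Demo");

    // Imitates saving: the wall-clock value is interpreted in the zone and converted to UTC.
    public static DateTime ToStoredUtc(DateTime value)
    {
        return TimeZoneInfo.ConvertTimeToUtc(value, DemoZone);
    }

    // Imitates reading the stored UTC value back as a time in the zone.
    public static DateTime ToZoneTime(DateTime storedUtc)
    {
        return TimeZoneInfo.ConvertTimeFromUtc(storedUtc, DemoZone);
    }
}
```

With new DateTime(2012, 4, 12) as input, ToStoredUtc returns 2012-04-11 22:00 (UTC), which is not equal to the original value. Passing that result to ToZoneTime gives back 2012-04-12 00:00, which is equal again.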

Set Serialization Options for DateTimes

What can we do to fix this problem? We can register a serialization option for the DateTime value. With this option we specify that the value is converted to a local DateTime when the object is retrieved from the database. The following code snippet shows how to convert the CreationDate from UTC to the local DateTime value.


public class RegisterSerializer
{
    public void Setup()
    {
        BsonClassMap.RegisterClassMap<BlogPost>(cm => {
            cm.AutoMap();
            // Return CreationDate as local time when the document is deserialized.
            cm.GetMemberMap(c => c.CreationDate).SetSerializationOptions(
                    new DateTimeSerializationOptions(DateTimeKind.Local));
        });
    }
}

After adding this class we only need to call its Setup method. This can be done inside the RepositorySubject constructor by adding the following two lines of code.


var registerSerializer = new RegisterSerializer();
registerSerializer.Setup();

Now we can run the tests again. The result should be as expected – green.





Thoughts about Replica Set configuration with MongoDB

2 03 2012

In my last post I provided a setup script which can be used to get a simple replica set configuration up and running. Now I want to talk about some details of the script and why I wrote it the way I did.

Why should I use the option notablescan?

In my opinion this is a flag which should be set on every development environment. When you write new functionality inside your data access layer, it can happen that you forget to update the indices on the database. Queries without a matching index perform badly, and on applications with a lot of traffic this leads to serious performance issues.

To avoid this problem, enable the notablescan option on your development environment. Every time a query has to scan the complete collection to fetch data (because of a missing index), you will receive an exception similar to the following:

[Image: exception raised for a query that requires a table scan]

When you set this option on your development environment, the risk of deploying code to your live systems without a correct index is greatly reduced.

Priority for Replica Set nodes

While setting up a replica set, you can give every node a priority through the configuration. The priority influences which node is elected as primary: a higher priority means a higher chance of becoming primary, and a priority of 0 excludes a node from becoming primary altogether. This is useful to keep nodes with poor performance from being elected. With version 2.0.2 of MongoDB (Windows) you can specify a priority from 0.0 to 100.0. Inside the script I want the node with the smallest port number to be the primary. The first node therefore starts with a priority of 100, and every following node receives a priority that is decreased by 1. This makes sure that the node with the smallest port number has the highest priority and will be elected as primary. If the first node fails to start, the node with the second-highest priority takes over.
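The priority assignment described above can be sketched in a few lines of C#. This is not the setup script itself: the class ReplicaSetPriorities and the method Assign are hypothetical names I made up for illustration, and the sketch only mirrors the rule "smallest port gets 100, every next node one less".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReplicaSetPriorities
{
    // Returns a port -> priority map: the smallest port gets 100,
    // and every following port (in ascending order) gets one less.
    public static IDictionary<int, double> Assign(IEnumerable<int> ports)
    {
        return ports
            .OrderBy(port => port)
            .Select((port, index) => new { Port = port, Priority = 100.0 - index })
            .ToDictionary(entry => entry.Port, entry => entry.Priority);
    }
}
```

For the ports 27017, 27018 and 27019 (assumed values) this yields the priorities 100, 99 and 98, regardless of the order in which the ports are passed in.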

Reinstallation of Replica Sets; but what happened to my data?

As I mentioned in my last post about the setup script, we remove the old service (if one exists with the same name) and install everything from scratch. Because we want to keep the data inside the database, a few extra steps are needed.

If we used the script to install the replica set and created a database called “MyTestDb”, the folder structure on the file system should look like the following picture.

[Image: folder structure of the replica set data directories]

Now we want to reinstall the instance with another configuration and keep all the existing data. On every node except the node with the smallest port number, we make sure that the data folder holds no files or folders. In this case the “replSet2” and the “arbiter” folders are emptied completely. We need to do this because, when initiating a replica set, no node is allowed to hold data except for the node where you initiate the configuration.

Inside the “replSet1” folder we only need to delete the content of the local folder. The content of the “MyTestDb” folder isn’t touched. Why do we need to do this? The configuration of the replica set is stored inside the local database. If we don’t delete the content of the local folder, we can’t run the initiate method; we can only reconfigure the existing set. To avoid having to distinguish between a new installation and a reinstall, I decided to implement the script so that it removes the old configuration and always runs a completely new initiation process.

After the new installation the replica set should come up with the new configuration. On startup the replica set will begin to sync all data from the “replSet1” folder to all other nodes, which includes all data from the “MyTestDb” folder. When the replication process is finished, you have a freshly configured replica set with the complete content from the old configuration.
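The cleanup described above can be sketched as follows. This is not the code of the setup script: the class ReplicaSetDataCleanup and its methods are hypothetical names I made up for illustration. The sketch assumes the folder layout from this post (one folder per database inside each node's data folder) and that all MongoDB services are stopped before it runs.

```csharp
using System;
using System.IO;

public static class ReplicaSetDataCleanup
{
    // Deletes everything inside a directory but keeps the directory itself.
    public static void EmptyDirectory(string path)
    {
        if (!Directory.Exists(path)) return;
        foreach (var file in Directory.GetFiles(path)) File.Delete(file);
        foreach (var dir in Directory.GetDirectories(path)) Directory.Delete(dir, true);
    }

    // firstNodePath: data folder of the node with the smallest port number (keeps its databases).
    // otherNodePaths: data folders of all other nodes, including the arbiter.
    public static void PrepareReinstall(string firstNodePath, params string[] otherNodePaths)
    {
        // Only the replica set configuration (the local database) is removed on the first node.
        EmptyDirectory(Path.Combine(firstNodePath, "local"));

        // Every other node has to start without any data.
        foreach (var nodePath in otherNodePaths) EmptyDirectory(nodePath);
    }
}
```

A call for the example in this post would pass the “replSet1” folder as the first argument and the “replSet2” and “arbiter” folders as the remaining ones. The “MyTestDb” folder inside “replSet1” is left untouched.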

I hope this information helps some of you. If you have any questions about this or the setup script, just let me know.

Cheers,

Daniel