Thoughts about Replica Set configuration with MongoDB

2 03 2012

In my last post I provided a setup script that gets a simple replica set configuration up and running. Now I want to talk about some details of the script and why I built it the way it is.

Why should I use the notablescan option?

In my opinion this is a flag that should be set in every development environment. When you write new functionality in your data access layer, it can happen that you forget to update the indexes on the database. This results in bad query performance. Especially in applications with a lot of traffic, this leads to serious performance issues.

To avoid this problem, enable the notablescan option in your development environment. Every time a query has to scan the complete collection to fetch data (because of a missing index), you will receive an exception similar to the following:

[Image: exception message thrown when a query triggers a full collection scan with notablescan enabled]

When you set this option in your development environment, the risk of deploying code without the proper indexes to your live systems is reduced to a minimum.
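As a sketch, the option can be enabled either at startup or at runtime (the dbpath is a hypothetical placeholder; both forms exist in mongod):

```shell
# Enable notablescan when starting mongod:
mongod --dbpath <your dbpath> --notablescan

# Or switch it on at runtime from the mongo shell:
# db.adminCommand({ setParameter: 1, notablescan: 1 })
```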

Priority for Replica Set nodes

While setting up a replica set, you can give every node a priority through the configuration. The priority is used to elect a node as the primary: a higher priority means a higher chance of becoming primary, and a priority of 0 excludes a node from becoming primary at all. This is useful to keep nodes with bad performance from being elected. With version 2.0.2 of MongoDB (Windows) you can specify a priority from 0.0 to 100.0. Inside the script I want the node with the smallest port number to become the primary, so the configuration for the first node starts with a priority of 100 and every following node receives a priority decreased by 1. This makes sure the node with the smallest port number has the highest priority and is elected as primary. Should the first node fail to start, the node with the second highest priority takes over.
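The priority assignment described above can be sketched in plain JavaScript (the port, host, and replica set names are hypothetical placeholders, not taken from the actual script):

```javascript
// Build a replica set configuration where the node with the smallest
// port number gets the highest priority (100), decreasing by 1 per node.
var basePort = 27017;   // hypothetical first port
var nodeCount = 3;
var members = [];
for (var i = 0; i < nodeCount; i++) {
  members.push({
    _id: i,
    host: "localhost:" + (basePort + i),
    priority: 100 - i   // 100, 99, 98, ...
  });
}
var config = { _id: "rs0", members: members };
// In the mongo shell this object would be passed to rs.initiate(config).
```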

Reinstallation of Replica Sets; but what happened to my data?

As I mentioned in my last post about the setup script, we remove the old service (if one exists with the same name) and install everything anew. Since we want to keep the data inside the database, we need to do a few things to achieve this.

If we used the script to install the replica set and created a database called “MyTestDb”, we should have a folder structure on the file system that looks like the following picture.

[Image: data folder structure containing the “replSet1”, “replSet2” and “arbiter” folders]

Now we want to reinstall the instance with another configuration and keep all the existing data. On every node except the node with the smallest port number, we make sure that none of the data folders contain any data or subfolders. In this case the “replSet2” and the “arbiter” folders are emptied completely. We need to do this because, when installing a replica set, it is not allowed to have data inside any node except the node where you initiate the configuration.

Inside the “replSet1” folder we only need to delete the content of the local folder; the content of the “MyTestDb” folder isn’t touched. Why do we need to do this? The configuration of the replica set is stored inside the local database. If we don’t delete the content of the local folder, we can’t run the initiate method; we can only use the reconfigure options. To avoid distinguishing between a new installation and a reinstallation, I decided to implement the script as a removal followed by a completely new initiation.
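The cleanup steps can be sketched as shell commands. The paths are hypothetical placeholders mimicking the folder structure above; the sketch builds a mock data directory so it can run anywhere without touching a real installation:

```shell
# Mock data directory mimicking the structure from the picture above
DATA=$(mktemp -d)
mkdir -p "$DATA/replSet1/local" "$DATA/replSet1/MyTestDb" \
         "$DATA/replSet2/local" "$DATA/arbiter/local"
touch "$DATA/replSet1/local/local.0"        # replica set configuration lives here
touch "$DATA/replSet1/MyTestDb/MyTestDb.0"  # application data we want to keep

# Every node except the one with the smallest port number: empty completely
rm -rf "$DATA/replSet2/"* "$DATA/arbiter/"*

# First node: delete only the content of the local folder,
# leave the MyTestDb folder untouched
rm -rf "$DATA/replSet1/local/"*

ls "$DATA/replSet1/MyTestDb"
```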

After the new installation, the replica set should come up with the new configuration. On startup, the replica set starts to sync all data from the “replSet1” folder to all other nodes, including everything in the “MyTestDb” folder. When the replication process is finished, you have a completely newly configured replica set with the complete content from the old configuration.

I hope this information helps some of you. If you have any questions about this or the setup script, just let me know.

Cheers,

Daniel





Problem using $rename on indexed fields using MongoDB

1 02 2012

Today we found a problem that occurs in MongoDB when you rename an indexed field. The problem occurs on version 2.0.2; I didn’t test it on other versions.

If you have simple documents having the following structure:

{
    PreName : "Daniel",
    Name : "Weber"
}

You may want to rename all “Name” elements to “LastName”. To do this you can use the $rename functionality from MongoDB:
http://www.mongodb.org/display/DOCS/Updating#Updating-%24rename

The command for a rename should be something like this:

db.MyCollection.update( { } , { $rename : { "Name" : "LastName" } } )

When you look at the documents after running the rename query, everything is fine; the field is renamed correctly.

If you run the rename command again, nothing happens (as expected), because a field called “Name” doesn’t exist anymore.

The problem occurs only when there is an index on the field you want to rename. So we create the same simple document and add an index on the “Name” field with the following command:

db.MyCollection.ensureIndex( { "Name" : 1 } )

Information about index creation can be found on
http://www.mongodb.org/display/DOCS/Indexes#Indexes-Basics

Now we run the same update command. When we look into the database, the field is renamed as expected. Then we run the rename command again, and a strange thing happens: the renamed field is deleted and the “LastName” value is lost.

Therefore, be careful with renames of indexed fields when a script may run the rename more than once.
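One way to guard against this (a sketch of my own, not part of the original repro) is to restrict the update to documents that still contain the old field, so a second run matches nothing and cannot delete anything. This requires a running mongod and is meant for the mongo shell:

```javascript
// Only touch documents that still have the old field; a repeated run
// matches no documents and therefore cannot trigger the data loss.
db.MyCollection.update(
    { "Name" : { $exists : true } },
    { $rename : { "Name" : "LastName" } },
    false,  // upsert
    true    // multi: update all matching documents
)
```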