HACK: Creating triggers for MongoDB

14 05 2012

I visited the MongoDB conference in Berlin. In a talk about tips, tricks and hacks for MongoDB, the speaker mentioned a little hack that can be used to create a trigger for MongoDB. He only outlined the idea briefly and in theory, so I wanted to try it out.

When you have configured MongoDB to run as a replica set, you may have noticed that a new collection called “oplog.rs” is created in the local database. In this collection MongoDB stores all insert, update and delete operations that are executed against the replica set (it is comparable to the transaction log of a SQL Server). The oplog collection is used to distribute all operations from the primary node to all secondaries. With the help of this collection and a little JavaScript file we can create something that behaves like a trigger.

Let’s start with the oplog collection. An entry from this collection looks similar to the following extract.

{
    "ts" : {
        "$timestamp" : NumberLong("5724119038133534721")
    },
    "h" : NumberLong("-7041921609633449468"),
    "op" : "i",
    "ns" : "TestApplication.BlogPost",
    "o" : {
        "_id" : ObjectId("4f7027f0df6e252390d2332a"),
        "Author" : "Test Author",
        "CreationDate" : new Date("Mon, 12 Mar 2012 00:00:00 GMT +01:00"),
        "Comment" : "My Comment"
    }
}

ts: is the timestamp. We need the timestamp to make sure that an entry does not fire the trigger twice.

op: is the operation. The interesting operations are “i” for insert, “u” for update and “d” for delete.

ns: is the namespace (database and collection) where the operation was executed.

o: is the object which was created or updated.
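To make the fields a bit more tangible, here is a minimal sketch that turns an oplog entry into a readable summary. The helper name describeOplogEntry and the constant OPERATION_NAMES are my own inventions for illustration and are not part of MongoDB. The sketch assumes that the namespace has the form "database.collection" and splits at the first dot only, because collection names may contain dots while database names may not.

```javascript
// Map the single-letter "op" codes to readable names.
// Only the three codes discussed above are listed; other codes exist.
var OPERATION_NAMES = { i: 'insert', u: 'update', d: 'delete' };

// Hypothetical helper: turns a raw oplog entry into a small summary object.
function describeOplogEntry(entry) {
    // Split the namespace at the first dot only, so that collection
    // names containing dots stay intact.
    var dot = entry.ns.indexOf('.');
    return {
        operation: OPERATION_NAMES[entry.op] || 'other',
        database: entry.ns.substring(0, dot),
        collection: entry.ns.substring(dot + 1),
        document: entry.o
    };
}

// Usage with a trimmed-down version of the entry shown above.
var summary = describeOplogEntry({
    op: 'i',
    ns: 'TestApplication.BlogPost',
    o: { Author: 'Test Author', Comment: 'My Comment' }
});
// summary.operation is 'insert', summary.database is 'TestApplication',
// summary.collection is 'BlogPost'
```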

If you need more information about the oplog, have a look at the following page on the MongoDB website:

http://www.mongodb.org/display/DOCS/Replica+Sets+-+Oplog

Now we create the JavaScript file. The script contains a while loop that never exits, because we want to watch the oplog for all changes and react to them. As long as the script is running, we get a behavior similar to a trigger.

Two features of MongoDB make this script possible: tailable cursors and the await data option for cursors. Both are set as options on the cursor inside the script.

Here is the script; modify and reuse it if you like.

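// The oplog lives in the "local" database, whatever database the shell is connected to.
var coll = db.getSiblingDB('local').oplog.rs;
// Start behind the newest entry so that only new operations are reported.
var lastTimeStamp = coll.find().sort({ '$natural' : -1 })[0].ts;

while(1){
    var cursor = coll.find({ ts: { $gt: lastTimeStamp } });
    // tailable: the cursor stays open after the last entry has been read
    cursor.addOption( 2 );
    // await data: the server waits a moment for new entries before it answers
    cursor.addOption( 32 );

    while( cursor.hasNext() ){
        var doc = cursor.next();
        // remember the position so that no entry is processed twice
        lastTimeStamp = doc.ts;
        printjson( doc );
    }
}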

At the moment the script checks the oplog for new operations and prints every oplog entry. Just replace the line with the printjson command with the operation you want to perform as the result of the trigger. In the line where the cursor is initialized you can extend the query, for example if you only want to react to update operations.
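Both changes could look roughly like the following sketch. The names buildOplogQuery, triggers and fireTrigger are hypothetical helpers that I made up for illustration; only the field names ts, op and ns and the query operators $gt and $in come from MongoDB. In the real script lastTimeStamp would be the timestamp object read from the oplog; the 0 used here is just a placeholder.

```javascript
// Hypothetical helper: builds the filter that is passed to coll.find().
// lastTimeStamp: the "ts" value of the last processed entry
// operations:    optional array of op codes, e.g. ['u'] or ['i', 'd']
// namespace:     optional "database.collection" string
function buildOplogQuery(lastTimeStamp, operations, namespace) {
    var query = { ts: { $gt: lastTimeStamp } };
    if (operations && operations.length > 0) {
        query.op = { $in: operations };
    }
    if (namespace) {
        query.ns = namespace;
    }
    return query;
}

// Hypothetical trigger table: one function per operation code.
var triggers = {
    u: function (doc) { return 'updated in ' + doc.ns; }
};

// Runs the matching trigger, or returns null if none is registered.
function fireTrigger(doc) {
    var handler = triggers[doc.op];
    return handler ? handler(doc) : null;
}

// Only react to updates on the BlogPost collection (0 is a placeholder timestamp).
var query = buildOplogQuery(0, ['u'], 'TestApplication.BlogPost');
```

Inside the loop you would then call coll.find(query) and replace printjson( doc ) with fireTrigger( doc ). Filtering in the query keeps entries you don't care about from being sent to the script at all.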

I have been working on a project with MongoDB for nearly 1.5 years now and I haven’t come across a problem where I really needed a trigger. I have seen a couple of people asking for triggers on different pages and hope I can help some of them with this little hack. Note that this has not been tested in a high-traffic environment.





Thoughts about Replica Set configuration with MongoDB

2 03 2012

In my last post I provided a setup script which can be used to get a simple replica set configuration up and running. Now I want to talk about some details of the script and why I wrote it the way I did.

Why should I use the option notablescan?

In my opinion this is a flag which should be set in every development environment. When you write new functionality in your data access layer, it can happen that you forget to update the indexes on the database. This results in slow queries and, especially in applications with a lot of traffic, in serious performance issues.

To avoid this problem, enable the notablescan option in your development environment. Every time a query has to scan the complete collection to fetch data (because of a missing index), you will receive an exception similar to the following:

image

When you set this option in your development environment, the risk of deploying code without a correct index to your live systems is reduced to a minimum.

Priority for Replica Set nodes

While setting up a replica set, you can give every node a priority through the configuration. The priority is used when a primary is elected: a node with a higher priority is more likely to become primary, and a priority of 0 excludes a node from becoming primary. This is useful to keep nodes with bad performance from becoming primary. With version 2.0.2 of MongoDB (Windows) you can specify a priority from 0.0 to 100.0. Inside the script I want the node with the smallest port number to be the primary. The configuration for the first node starts with a priority of 100, and every further node receives a priority which is decreased by 1. This makes sure that the node with the smallest port number has the highest priority and will be elected as primary. In case the first node fails to start, the node with the second highest priority takes over.
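The priority scheme can be sketched in a few lines of JavaScript. This is not the code from the setup script (which is written in Powershell), only an illustration of the idea. The function name buildReplicaSetConfig, the set name 'MyReplicaSet', the host name localhost and the assumption that the arbiter gets the next free port are all mine. The shape of the result follows the replica set configuration document with the fields _id, members, host, priority and arbiterOnly.

```javascript
// Hypothetical helper: builds a replica set configuration document in which
// the node with the smallest port number gets the highest priority.
function buildReplicaSetConfig(setName, startPort, numberNodes, useArbiter) {
    var members = [];
    for (var i = 0; i < numberNodes; i++) {
        members.push({
            _id: i,
            host: 'localhost:' + (startPort + i),
            priority: 100 - i   // 100 for the first node, then 99, 98, ...
        });
    }
    if (useArbiter) {
        // An arbiter only votes in elections and never holds data.
        // Assumption: it listens on the port after the last data node.
        members.push({
            _id: numberNodes,
            host: 'localhost:' + (startPort + numberNodes),
            arbiterOnly: true
        });
    }
    return { _id: setName, members: members };
}

// Two data nodes on ports 30000 and 30001 plus an arbiter on 30002.
var config = buildReplicaSetConfig('MyReplicaSet', 30000, 2, true);
```

A fixed step of 1 is enough here because only the order of the priorities matters, not their absolute values.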

Reinstallation of Replica Sets; but what happened to my data?

As I mentioned in my last post about the setup script, we remove the old service (if one exists with the same name) and install everything from scratch. Because we want to keep the data inside the database, a few extra steps are needed.

If we used the script to install the replica set and created a database called “MyTestDb”, the folder structure on the file system should look like the following picture.

image

Now we want to reinstall the instance with another configuration and keep all the existing data. On every node except the node with the smallest port number, we make sure that the data folder holds no files or folders. In this case the “replSet2” and the “arbiter” folders are emptied completely. We need to do this because, when a replica set is initiated, no node is allowed to hold data except the node where you initiate the configuration.

Inside the “replSet1” folder we only need to delete the content of the local folder. The content of the “MyTestDb” folder isn’t touched. Why do we need to do this? The configuration of the replica set is stored inside the local database. If we don’t delete the content of the local folder, we can’t run the initiate method; we can only use the reconfigure options. To avoid having to distinguish between a new installation and a reinstallation, I decided that the script always removes the old instance and runs a completely new initiation.

After the new installation the replica set should come up with the new configuration. On startup the replica set begins to sync all data from the “replSet1” folder to all other nodes, including everything in the “MyTestDb” folder. When the replication process is finished, you have a newly configured replica set with the complete content of the old one.

I hope this information helps some of you. If you have any questions about this or the setup script, just let me know.

Cheers,

Daniel





Setup MongoDB as a service with Powershell

27 02 2012

I have written a small Powershell script to set up a MongoDB instance as a single node or in a replica set configuration. I want to share the script here and hope it is useful for some of you (feedback is welcome). To run the script you need admin rights; otherwise the service can’t be created. The main purpose of the script is to get MongoDB up and running on a Windows PC for local development.

In this post I want to explain how the script works. In a later post I will provide some information regarding the setup and configuration.

I have separated the script into 3 files: MongoDbSetup.ps1, WebClient.ps1 and Zip.ps1. MongoDbSetup.ps1 is the main script and responsible for the installation. WebClient.ps1 and Zip.ps1 are only small helpers. WebClient.ps1 is used to download a file and display a progress bar while downloading. Zip.ps1 is used to unzip a zip file to a specified destination folder (used to unpack MongoDB after the download).

What the MongoDB setup script does

The following picture shows a simplified view of what the script does.

image

I want to give you a few more details about the script execution. The first thing we do is to set up the folder structure the script expects. We have a download folder where the downloaded MongoDB binaries (zip files) are stored. Every installed instance has its own folder, in this case “MongoDB ReplicaSet”. Inside the “MongoDB ReplicaSet” directory we have 3 folders: “bin” for the unzipped MongoDB binaries, “data” for the database files and “log” for all log messages.

image

The location and name of the folders can be passed as parameters when calling the script. Before we download the zip file with the MongoDB binaries, we want to make sure that no service is running with the same name as the service we want to create. Therefore the script shuts down existing services so that they can be replaced. This is handy when you want to update an existing instance (for example to another MongoDB version). Then the download of the zip file with the binaries starts. The download fetches the Windows 64-bit version of the binaries for the specified MongoDB version (tested with versions 2.0.2 and 1.8.5). If the format of the file name on the MongoDB server changes, you need to update the script. If you install a second node, the file is not downloaded again as long as the zip file is still inside the download folder (which is created by the installation process).

The next step is to unzip the archive and copy the executables to the bin folder of the new instance. Because every instance has its own bin folder, it is possible to run different MongoDB versions on different instances.

Now all preparation is done and we can start with the installation of the single node or the replica set. I will describe the replica set installation because it’s much more interesting; the single node installation is a sub-part of it.

Because we may want to change an existing replica set with the script, we first need to remove existing instances with the same name. Afterwards we can set up the nodes. The number of nodes is provided through a parameter when calling the script. After the setup of the nodes another node is installed as arbiter, unless you changed the default configuration. Now all nodes are installed and we need to start the services.

The last step is to configure the replica set. We create a file which holds the configuration for all nodes. After creating the file we run mongo.exe and pass the file as a parameter to initiate the replica set configuration. The replica set needs a bit of time until it is up and running. Connect to your newly created instance and check the replica set status by calling rs.status(). Then you are done.
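To give an idea of what such a file can contain, here is a small sketch that renders the text of a script for the mongo shell. It is written in JavaScript for illustration only; the real setup script is Powershell and may build the file differently. The function name renderInitScript and the set name 'MyReplicaSet' are made up, while rs.initiate() is the shell helper that takes the configuration document.

```javascript
// Hypothetical helper: renders the text of a script file for the mongo shell.
// The generated script contains a single call to rs.initiate() with the
// given configuration document.
function renderInitScript(config) {
    return 'rs.initiate(' + JSON.stringify(config, null, 4) + ');\n';
}

// Minimal configuration with a single node, just to show the output format.
var scriptText = renderInitScript({
    _id: 'MyReplicaSet',
    members: [ { _id: 0, host: 'localhost:30000' } ]
});
// scriptText now starts with "rs.initiate({" and ends with "});"
```

If the text is saved to a file, say initReplicaSet.js (again a made-up name), it can be passed to the shell as an argument: mongo.exe --port 30000 initReplicaSet.js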

As I mentioned above, there are a couple of parameters you can set when calling the script to override the default values. The following table lists these parameters.

Parameter        Default             Usage
version          2.0.2               Specify the version of the MongoDB binaries we want to use for the new instance.
Mode             ReplicaSet          Options are ReplicaSet and SingleMode. Depends on the instance you want to install.
portNumber       30000               Start the port number at the given port. On nodes for a replicaset the port is increased for every node.
numberNodes      2                   Number of nodes (without arbiter)
useArbiter       True                Create and use an arbiter
destinationPath  c:\mongodb\         Path where the installation stores the data
serviceName      MongoDB ReplicaSet  Name of the service which is created. When creating a replicaset a number is attached to the name.

I have uploaded the scripts to GitHub; use them at your own risk 🙂
The repository is located at: https://github.com/danielweberonline/MongoDB-Setup

Cheers,

Daniel