HACK: Creating triggers for MongoDB

14 05 2012

I visited the MongoDB conference in Berlin. At one talk about tips, tricks and hacks for MongoDB the speaker mentioned that there is a little hack which you can use to create a trigger for MongoDB. I wanted to try this out because he only mentioned how to do this theoretically very shortly.

When you have configured MongoDB to work as a replicaset you maybe have noticed that on the local database a new collection called “oplog.rs” is created. Inside this collection MongoDB stores all insert / update and delete operations which are executed against this replicaset (it’s comparable to the transaction log on a SQL Server). The oplog collection is used to distribute all the operations from the primary node to all secondary’s. With the help of this collection and a little javascript file we are able to create something which behaves like a trigger.

Let’s start with the oplog collection. If you look at an entry from this collection you can see something which can look similar to the following extract.

{
    "ts" : {
        "$timestamp" : NumberLong("5724119038133534721")
    },
    "h" : NumberLong("-7041921609633449468"),
    "op" : "i",
    "ns" : "TestApplication.BlogPost",
    "o" : {
        "_id" : ObjectId("4f7027f0df6e252390d2332a"),
        "Author" : "Test Author",
        "CreationDate" : new Date("Mon, 12 Mar 2012 00:00:00 GMT +01:00"),
        "Comment" : "My Comment"
    }
}

ts: is the timestamp. We need the timestamp to avoid that an element can be triggered twice.

op: is the operation. The interesting operations are “i” for insert / “u” for update and “d” for delete.

ns: is the namespace (database and collection) were the operation was executed.

o: is the object which is created or updated.

If you need more information about the oplog have a look at the following page about the oplog on the website from MongoDB:

http://www.mongodb.org/display/DOCS/Replica+Sets+-+Oplog

Now we create the javascript file. This script has a while loop without any option to exit this loop. We want to watch for all changes on the oplog and want to react on these changes. As long the script is running we have a behavior similar to a trigger.

Two features of MongoDB are used to allow the execution of this script (have a look at the links if you want to have further information):

Now have a look at the script and modify and reuse it if you like.

var coll = db.oplog.rs;
var lastTimeStamp = coll.find().sort({ '$natural' : -1 })[0].ts;

while(1){
    cursor = coll.find({ ts: { $gt: lastTimeStamp } });
    // tailable
    cursor.addOption( 2 );
    // await data
    cursor.addOption( 32 );

    while( cursor.hasNext() ){
        var doc = cursor.next();
        lastTimeStamp = doc.ts;
        printjson( doc );
    }
}

What the current script does is checking for operations inside the oplog and print out the oplog entry. Just change the line with the printjson command to the operation you want to perform as the result of the trigger. On the line where you initialize the cursor you can enhance the query if you maybe only want to react on update operations.

I developed on a project with MongoDB nearly 1.5 years now and I didn’t came across a problem were I really need a trigger. I saw a couple of people asking for triggers at different pages and hope I can help some of them with this little hack. This is not tested on a high traffic environments.

Advertisements




CSV export from MongoDB using PowerShell

7 02 2012

The tools to import and export data on a MongoDB instance are very powerful. I really like the tools because they are very easy to use. Some features would be nice to have, but you can reach a lot with the current set of tools and options.  Detailed information about the import and export tools can be found at the following address: http://www.mongodb.org/display/DOCS/Import+Export+Tools

Mongoexport offers an ability to export data to csv which you can easily read with Excel. This allows “normal” users to display, sort & filter data in their familiar environment. Especially for flat documents the csv export is a great option.

To export data from a collection you can use a command which is similar to this one:

mongoexport -d <databaseName> -c <collectionName> -f “<field1,field2>” --csv -o <outputFile>

But wait there is one thing which I don’t like about this. We must define the fields we want to export. When you use the “csv”-option for mongoexport, the field-options becomes required. But what can I do to avoid a hard coded list of fields? Especially on an environment where many changes will happen, you need a solution which works without a manual edited field list.

What we can do is a map/reduce to get all field names from every document inside a collection. With this result you are able to call mongoexport with a field list which is generated on the fly. Details about map/reduce for MongoDB can be found at the following address: http://www.mongodb.org/display/DOCS/MapReduce

The map/reduce can look like the following example:

function GetAllFields() {
    map = function(){
        for (var key in this) { emit(1, {"keys" : [ key ]}); }
    };

    reduce = function(key, values) {
        var resultArray = [],
            removeDuplicates = function (elements) {
                var result=[],
                    listOfElemets={};
                for (var i = 0, elemCount = elements.length;
                                                        i < elemCount; i++) {
                    for (var j = 0, keyCount = elements[i].keys.length;
                                                         j < keyCount; j++) {
                       listOfElemets[elements[i].keys[j]] = true;
                    }
                }
                for (var element in listOfElemets) {
                    result.push(element);
                }
            return result;
        }

        return { "keys" : removeDuplicates(values) };
    };

    retVal = db.<collectionName>.mapReduce(map, reduce, {out : {inline : 1}})
    print(retVal.results[0].value.keys);
}
GetAllFields();

This function can be stored inside a js-file. Now we need to execute the function. This can be done, for example with PowerShell and the help of mongo.exe. The result of the script execution is a comma separated field list with all field-names on the 1st -level from one collection. This is exactly the format we need for the csv export using mongoexport. Therefore we are ready to go and can call the export to a csv-file.

The following code shows a PowerShell script which retrieves the field-list and runs the export afterwards.

$fieldNames = (mongo.exe <server>/<database> <scriptFile> --quiet)
(mongoexport.exe -d <databaseName> -c <collectionName> -f $fieldNames --csv   -o <outputFile>)

Hope this will be useful for someone!