CSV export from MongoDB using PowerShell

7 02 2012

The tools to import and export data on a MongoDB instance are very powerful. I really like the tools because they are very easy to use. Some features would be nice to have, but you can reach a lot with the current set of tools and options.  Detailed information about the import and export tools can be found at the following address: http://www.mongodb.org/display/DOCS/Import+Export+Tools

Mongoexport offers an ability to export data to csv which you can easily read with Excel. This allows “normal” users to display, sort & filter data in their familiar environment. Especially for flat documents the csv export is a great option.

To export data from a collection you can use a command which is similar to this one:

mongoexport -d <databaseName> -c <collectionName> -f “<field1,field2>” --csv -o <outputFile>

But wait there is one thing which I don’t like about this. We must define the fields we want to export. When you use the “csv”-option for mongoexport, the field-options becomes required. But what can I do to avoid a hard coded list of fields? Especially on an environment where many changes will happen, you need a solution which works without a manual edited field list.

What we can do is a map/reduce to get all field names from every document inside a collection. With this result you are able to call mongoexport with a field list which is generated on the fly. Details about map/reduce for MongoDB can be found at the following address: http://www.mongodb.org/display/DOCS/MapReduce

The map/reduce can look like the following example:

function GetAllFields() {
    map = function(){
        for (var key in this) { emit(1, {"keys" : [ key ]}); }
    };

    reduce = function(key, values) {
        var resultArray = [],
            removeDuplicates = function (elements) {
                var result=[],
                    listOfElemets={};
                for (var i = 0, elemCount = elements.length;
                                                        i < elemCount; i++) {
                    for (var j = 0, keyCount = elements[i].keys.length;
                                                         j < keyCount; j++) {
                       listOfElemets[elements[i].keys[j]] = true;
                    }
                }
                for (var element in listOfElemets) {
                    result.push(element);
                }
            return result;
        }

        return { "keys" : removeDuplicates(values) };
    };

    retVal = db.<collectionName>.mapReduce(map, reduce, {out : {inline : 1}})
    print(retVal.results[0].value.keys);
}
GetAllFields();

This function can be stored inside a js-file. Now we need to execute the function. This can be done, for example with PowerShell and the help of mongo.exe. The result of the script execution is a comma separated field list with all field-names on the 1st -level from one collection. This is exactly the format we need for the csv export using mongoexport. Therefore we are ready to go and can call the export to a csv-file.

The following code shows a PowerShell script which retrieves the field-list and runs the export afterwards.

$fieldNames = (mongo.exe <server>/<database> <scriptFile> --quiet)
(mongoexport.exe -d <databaseName> -c <collectionName> -f $fieldNames --csv   -o <outputFile>)

Hope this will be useful for someone!

Advertisements

Actions

Information

2 responses

5 03 2012
Daniel

I have updated the script to be idempotent. The fact that the map / reduce engine may invoke reduce functions iteratively results in duplicates in the result set.

27 02 2013
rubytuesdaydono

very helpful and creative – leveraging mongodb’s strengths to your advantage. thank you!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




%d bloggers like this: