Fork me on GitHub

MongoDB Cookbook MongoDB Cookbook

Pivot Data with Map reduce

Credit: Gaetan Voyer-Perrault

Problem

You have a collection of Actors with an array of the Movies they've done.

You want to generate a collection of Movies with an array of Actors in each.

Some sample data

  db.actors.insert( { actor: "Richard Gere", movies: ['Pretty Woman', 'Runaway Bride', 'Chicago'] });
  db.actors.insert( { actor: "Julia Roberts", movies: ['Pretty Woman', 'Runaway Bride', 'Erin Brockovich'] });

Solution

We need to loop through each movie in the Actor document and emit each Movie individually.

The catch here is in the reduce phase. We cannot emit an array from the reduce phase, so we must build an Actors array inside of the "value" document that is returned.

The code

map = function() {
  for(var i in this.movies){
    key = { movie: this.movies[i] };
    value = { actors: [ this.actor ] };
    emit(key, value);
  }
}

reduce = function(key, values) {
  actor_list = { actors: [] };
  for(var i in values) {
    actor_list.actors = values[i].actors.concat(actor_list.actors);
  }
  return actor_list;
}

Notice how actor_list is actually a javascript object that contains an array. Also notice that map emits the same structure.

Run the following to execute the map / reduce, output it to the "pivot" collection and print the result:

  printjson(db.actors.mapReduce(map, reduce, "pivot"));
  db.pivot.find().forEach(printjson);

Here is the sample output, note that "Pretty Woman" and "Runaway Bride" have both "Richard Gere" and "Julia Roberts".

{ "_id" : { "movie" : "Chicago" }, "value" : { "actors" : [ "Richard Gere" ] } }
{ "_id" : { "movie" : "Erin Brockovich" }, "value" : { "actors" : [ "Julia Roberts" ] } }
{ "_id" : { "movie" : "Pretty Woman" }, "value" : { "actors" : [ "Richard Gere", "Julia Roberts" ] } }
{ "_id" : { "movie" : "Runaway Bride" }, "value" : { "actors" : [ "Richard Gere", "Julia Roberts" ] } }
blog comments powered by Disqus