Trusted answers to developer questions


Educative Answers Team


MapReduce is a programming model and framework developed by Google for processing large amounts of data in a timely and efficient manner. One of the best-known open-source implementations is Apache Hadoop MapReduce.


MapReduce takes advantage of numerous servers across which data can be distributed and processed. Like every good framework, MapReduce abstracts away the underlying processes that run while a user's job executes. A few of these processes include fault tolerance, partitioning data, and aggregating data. The abstractions let the user focus on the high-level logic of the program while trusting the framework to handle these processes under the hood.

How it works

The workflow that MapReduce follows is:

  • Partitioning
  • Map
  • Intermediate Files
  • Reduce
  • Aggregate

A job has several Map Workers and Reduce Workers, but only one Master Node. The Master Node assigns tasks to the Map and Reduce Workers and keeps track of their progress.


Partitioning

The input data usually arrives as one big chunk. It must first be partitioned into smaller, more manageable pieces that the Map Workers can handle efficiently.
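The partitioning step can be sketched as follows. This is a minimal illustration, not how any particular framework does it: the function name `split_input` and the idea of splitting by record count are assumptions made for the example (real systems typically split by byte size).

```python
def split_input(records, split_size):
    """Cut a list of records into consecutive splits of at most split_size items.

    Hypothetical helper for illustration; each split would be handed to one Map Worker.
    """
    if split_size <= 0:
        raise ValueError("split_size must be positive")
    return [records[i:i + split_size] for i in range(0, len(records), split_size)]


lines = ["a b", "b c", "c d", "d e", "e f"]
splits = split_input(lines, 2)
print(splits)  # [['a b', 'b c'], ['c d', 'd e'], ['e f']]
```

The last split may be smaller than the others, which is fine because each split is processed independently.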



Map

Each Map Worker receives its input as <key, value> pairs (for example, the key is a filename and the value is the file's content). The worker applies the user-defined Map Function to each pair to generate intermediate <key, value> pairs.
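As a concrete case, here is a sketch of a user-defined Map Function for the classic word-count job. The function name `map_word_count` and the sample filename are made up for illustration; the only thing taken from the description above is the shape of the input (filename, content) and of the output (intermediate <key, value> pairs).

```python
def map_word_count(filename, content):
    """Example Map Function: emit (word, 1) for every word in the content.

    The filename key is unused here, which is common for word count.
    """
    for word in content.lower().split():
        yield (word, 1)


pairs = list(map_word_count("doc1.txt", "the cat saw the dog"))
print(pairs)
# [('the', 1), ('cat', 1), ('saw', 1), ('the', 1), ('dog', 1)]
```

Note that the Map Function does not add up repeated words itself; combining values that share a key is left to the Reduce step.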


Intermediate Files

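The intermediate pairs are divided into R partitions (R is the number of reduce tasks), typically by hashing the key so that every pair with the same key lands in the same partition. The pairs are buffered in memory and periodically written to the Map Worker's local disk. The locations of these files are reported to the Master Node, which forwards them to the Reduce Workers.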
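A common way to choose a partition is to hash the key and take the result modulo R. The sketch below assumes that scheme; `partition_for` and `partition_pairs` are hypothetical names, and `zlib.crc32` is used only because it gives the same result on every run (Python's built-in `hash` for strings varies between processes).

```python
import zlib


def partition_for(key, num_reducers):
    """Map a string key to a partition index in the range 0..num_reducers-1."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def partition_pairs(pairs, num_reducers):
    """Distribute intermediate (key, value) pairs into num_reducers regions."""
    regions = [[] for _ in range(num_reducers)]
    for key, value in pairs:
        regions[partition_for(key, num_reducers)].append((key, value))
    return regions


intermediate = [("the", 1), ("cat", 1), ("the", 1), ("dog", 1)]
regions = partition_pairs(intermediate, 3)
for index, region in enumerate(regions):
    print(index, region)
```

Because the partition depends only on the key, both ("the", 1) pairs end up in the same region, so a single reduce task sees all values for that key.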



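Reduce

Once a Reduce Worker has read its partition of the intermediate data, it sorts the pairs by key so that all values with the same key are grouped together. It then applies the user-defined Reduce Function to each key and its list of values.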
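The sort-then-group behaviour can be sketched with the standard library. The names `reduce_word_count` and `run_reduce` are assumptions for this example, and the Reduce Function shown is the word-count one (sum the values for each key).

```python
from itertools import groupby
from operator import itemgetter


def reduce_word_count(key, values):
    """Example Reduce Function: total the counts emitted for one key."""
    return (key, sum(values))


def run_reduce(pairs, reduce_fn):
    """Sort pairs by key, group equal keys, and apply reduce_fn to each group."""
    ordered = sorted(pairs, key=itemgetter(0))
    results = []
    for key, group in groupby(ordered, key=itemgetter(0)):
        results.append(reduce_fn(key, [value for _, value in group]))
    return results


pairs = [("the", 1), ("cat", 1), ("the", 1), ("dog", 1)]
counts = run_reduce(pairs, reduce_word_count)
print(counts)  # [('cat', 1), ('dog', 1), ('the', 2)]
```

Sorting first matters here: `groupby` only merges adjacent items, so without the sort the two "the" pairs would form separate groups.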



Aggregate

The Master Node is notified when the Reduce Workers finish their tasks. Each reduce task writes its sorted results to its own output file, so the job ends with R output files for the user.
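To show how the job ends with R outputs, the following single-process sketch runs a small word count from start to finish. It is a simulation under stated assumptions, not a distributed implementation: `map_fn`, `reduce_fn`, and `map_reduce` are hypothetical names, partitions are chosen by CRC32 of the key modulo R, and each "output file" is represented by an in-memory list.

```python
import zlib
from collections import defaultdict


def map_fn(filename, content):
    """Emit (word, 1) for each word in one document."""
    for word in content.split():
        yield word, 1


def reduce_fn(key, values):
    """Sum all counts collected for one word."""
    return key, sum(values)


def map_reduce(documents, num_reducers):
    """Return one sorted result list per reduce task (num_reducers lists)."""
    # Map phase plus partitioning: one region of grouped values per reduce task.
    regions = [defaultdict(list) for _ in range(num_reducers)]
    for filename, content in documents.items():
        for key, value in map_fn(filename, content):
            index = zlib.crc32(key.encode("utf-8")) % num_reducers
            regions[index][key].append(value)
    # Reduce phase: each task handles its keys in sorted order.
    return [[reduce_fn(key, region[key]) for key in sorted(region)]
            for region in regions]


docs = {"a.txt": "x y x", "b.txt": "y z"}
outputs = map_reduce(docs, 2)
for index, output in enumerate(outputs):
    print(f"output {index}: {output}")
```

Each word appears in exactly one of the R outputs, so reading all outputs together gives the complete result.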


big data
Copyright ©2023 Educative, Inc. All rights reserved