
Map-Reduce Pattern

Learn to apply the map-reduce pattern in Python for processing large data sets by mapping functions to iterables and reducing results into single outcomes. Understand how to filter and combine data efficiently, and explore functional programming techniques like map, filter, enumerate, and reduce to handle data streams without storing all data in memory.

What is the map-reduce pattern?

The map-reduce pattern is a way of processing large data sets so that the work can be distributed among many computers.

The basic idea is to first process each data element individually (the map step) and then combine the individual results to give the required result (the reduce step).

To give a simple example, suppose we want to calculate the average length of the words in a block of text:

The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code – not in reams of trivial code that bore the reader to death.

We can break this task down into two steps:

  • Count the number of letters in each word.
  • Sum the total number of letters in all the words.

Dividing the sum by the number of words will give us our result, the average word length.
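As a rough illustration of these two steps, here is a minimal sketch using Python's built-in `map` and `functools.reduce`. The short `words` list is a hypothetical sample chosen only for this sketch, not the full word list from the text, and the variable names are assumptions.

```python
from functools import reduce
from operator import add

# Hypothetical sample words, used only for illustration.
words = ['the', 'joy', 'of', 'coding']

# Map step: compute the length of each word individually.
# map() returns a lazy iterator, so no intermediate list is stored.
lengths = map(len, words)

# Reduce step: combine the individual lengths into a single total.
total = reduce(add, lengths, 0)

# Divide the total by the number of words to get the average length.
average = total / len(words)
print(average)
```

Because `map` yields lengths one at a time and `reduce` consumes them as they arrive, the same approach could in principle be applied to a stream of words rather than a list held in memory, provided the word count is tracked separately.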

Here is a list of our words, with the punctuation removed:

strings = ['the', 'joy', 'of', 'coding', 'Python', 'should',
           'be', 'in', 'seeing', 'short', 'concise', 'readable',
           'classes', 'that', 'express', 'a', 'lot', 'of', 'action',
           'in', 'a', 'small',
...