Introduction to the Flow Library

The Enum and Stream modules

Since Elixir is a functional language and all data is immutable, most Elixir developers quickly get accustomed to using functions like map, filter, and reduce on a daily basis. These and other data-processing functions, found in the Enum and Stream modules, are essential to functional programming and help us transform data in various ways.
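As a quick refresher, here is a minimal sketch of the same computation written with Enum and with Stream. The input range and variable names are made up for illustration; only standard library functions are used.

```elixir
numbers = 1..10

# Enum is eager: every step walks the whole collection and builds a new list.
eager_total =
  numbers
  |> Enum.map(fn n -> n * n end)
  |> Enum.filter(fn n -> rem(n, 2) == 0 end)
  |> Enum.reduce(0, fn n, acc -> n + acc end)

# Stream is lazy: the steps are composed first, and nothing runs until
# an Enum function (here Enum.reduce/3) consumes the stream.
lazy_total =
  numbers
  |> Stream.map(fn n -> n * n end)
  |> Stream.filter(fn n -> rem(n, 2) == 0 end)
  |> Enum.reduce(0, fn n, acc -> n + acc end)

IO.inspect({eager_total, lazy_total})
```

Both pipelines produce the same result, and both do all of their work in a single process.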

The limitations of Enum and Stream

However, as the amount of data we have to process grows, so does the time it takes to finish the work. We already have a few tools at our disposal to run code concurrently, but implementing frequently used functions like reduce and group_by in parallel is going to be challenging. Thankfully, there is already a solution available to us on the Hex registry.
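To illustrate where the difficulty lies, here is a rough sketch using Task.async_stream/3 from the standard library, one of the concurrency tools we could reach for. The slow_square function is a hypothetical stand-in for expensive per-item work.

```elixir
# Hypothetical stand-in for slow work done on each item.
slow_square = fn n ->
  Process.sleep(50)
  n * n
end

# The map step is easy to run concurrently: each item is independent.
squares =
  1..8
  |> Task.async_stream(slow_square, max_concurrency: 4)
  |> Enum.map(fn {:ok, value} -> value end)

# The aggregation step still runs sequentially in the calling process.
total = Enum.sum(squares)

IO.inspect({squares, total})
```

Mapping in parallel is straightforward because items do not depend on each other. A parallel reduce or group_by is harder: several processes must each hold part of the accumulated state, and items that belong together have to be routed to the same process.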

Enter Flow

In this chapter, we’ll learn about Flow, a powerful library with a simple API that makes processing large collections of data a breeze. The Flow library uses GenStage under the hood, so operations run in parallel in separate stage processes, and back-pressure is handled for us.
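To give a first impression of the API, here is a minimal standalone sketch. It assumes Flow 1.x can be fetched from Hex with Mix.install/1 (which needs network access and a recent Elixir version); the input range is made up for illustration.

```elixir
# Assumption: fetch Flow 1.x from Hex for this standalone script.
Mix.install([{:flow, "~> 1.0"}])

even_squares =
  1..10
  # Turn the enumerable into a flow backed by GenStage processes.
  |> Flow.from_enumerable()
  |> Flow.map(fn n -> n * n end)
  |> Flow.filter(fn n -> rem(n, 2) == 0 end)
  # A flow is enumerable, so Enum functions run it and collect the results.
  # Output order is not guaranteed across stages, so we sort.
  |> Enum.sort()

IO.inspect(even_squares)
```

The pipeline reads almost exactly like its Enum counterpart, with Flow.from_enumerable/1 as the only extra step.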

Compare Flow with Enum and Stream

First, we’ll introduce Flow and compare it to the commonly used Enum and Stream modules by analyzing airport data from around the world. We’ll see how easy it is to convert existing code to run concurrently and work with large datasets. Then, we’ll look at how to run reduce operations concurrently and handle infinite and slow streams of data. Finally, we’ll revisit the scraper project and integrate Flow with an already running GenStage pipeline. This will give us some extra flexibility when solving problems. Let’s get started!

Create a new mix project

Before we begin, let’s scaffold a new project to work on. We’ll build a simple utility to help us analyze airport data by country. We’ll see which countries and territories have the largest number of working airports globally. Let’s call this application airports:

 mix new airports

Include flow as a dependency

Now, let’s edit mix.exs to add flow as a dependency. We’ll also need a CSV parser. Therefore, we change our dependencies list to the following:
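A sketch of what the dependency list in mix.exs might look like is shown here. The text above only calls for "a CSV parser", so the choice of nimble_csv and both version requirements are assumptions, not taken from the lesson.

```elixir
# In mix.exs. The CSV package and the version requirements are assumptions.
defp deps do
  [
    {:flow, "~> 1.0"},
    {:nimble_csv, "~> 1.1"}
  ]
end
```

After editing the file, running mix deps.get in the project directory fetches the new dependencies.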
