Build the Data Processing Pipeline
Explore how to build a data processing pipeline with Elixir's GenStage by creating producer and consumer stages. Understand GenStage callbacks like handle_demand and handle_events to manage event flow, and learn how to set up dynamic subscriptions. This lesson helps you build an efficient pipeline for concurrent data scraping and processing.
Complex use cases may require a data processing pipeline with one or more producers, several producer-consumers in between, and a consumer stage at the end. However, the main principles stay the same regardless of the number of stages, so we'll start with a simple two-stage pipeline and demonstrate how it works.
We will build a fake service that scrapes data from web pages—normally an intensive task dependent on system resources and a reliable network connection. Our goal is to request a number of URLs to be scraped and have the data pipeline take care of the workload.
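To give a sense of where we're heading, below is a minimal sketch of what a two-stage pipeline could look like. The module names (PageProducer and PageConsumer) and the logging are placeholders chosen for illustration; we'll build the real stages step by step in this lesson.

defmodule PageProducer do
  use GenStage

  def start_link(_args) do
    # Register the producer under its module name so consumers can find it.
    GenStage.start_link(__MODULE__, :ok, name: __MODULE__)
  end

  def init(:ok) do
    # Start as a producer with an empty state; no events are queued yet.
    {:producer, []}
  end

  def handle_demand(demand, state) do
    # Called when a consumer asks for events; we return none for now.
    IO.puts("PageProducer received demand for #{demand} pages")
    {:noreply, [], state}
  end
end

defmodule PageConsumer do
  use GenStage

  def start_link(_args) do
    GenStage.start_link(__MODULE__, :ok)
  end

  def init(:ok) do
    # Subscribe to the producer as soon as the consumer starts.
    {:consumer, :ok, subscribe_to: [PageProducer]}
  end

  def handle_events(events, _from, state) do
    # Pretend to scrape each page; a consumer never emits events itself.
    Enum.each(events, fn url -> IO.puts("PageConsumer is processing #{url}") end)
    {:noreply, [], state}
  end
end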
Create our mix project
First, we’ll create a new application with a supervision tree, as we’ve done before. We’ll name it scraper to match the data-scraping service we’re pretending to build:
mix new scraper --sup
We have already created the application on the backend for you, so there’s no need to run the command above; it would generate a new project named scraper. We have also added gen_stage as a dependency in mix.exs:
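For reference, the deps function in mix.exs now includes gen_stage. The exact version constraint below is only an example and may differ in your copy of the project:

defp deps do
  [
    {:gen_stage, "~> 1.2"}
  ]
end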
Then, we run the mix deps.get command to fetch the gen_stage dependency.
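If you’re running the project locally rather than on the backend, a typical workflow is to fetch and then compile the dependencies (these are standard Mix commands, nothing specific to this project):

mix deps.get
mix compile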