Build the Data Processing Pipeline
Create a new Mix project to demonstrate how GenStage can be used to build data processing pipelines.
Complex use cases may require a data processing pipeline with one or more producers, several producer-consumers in between, and a consumer stage at the end. The main principles stay the same, however, so we'll start with a two-stage pipeline and demonstrate how it works.
We will build a fake service that scrapes data from web pages—normally an intensive task dependent on system resources and a reliable network connection. Our goal is to request a number of URLs to be scraped and have the data pipeline take care of the workload.
Create our Mix project
First, we'll create a new application with a supervision tree, as we've done before. We'll name it scraper and pretend we're going to scrape data from web pages:
mix new scraper --sup
We have already created an application at the backend for you, so there's no need to run the command above. This creates a project named scraper. We have added gen_stage as a dependency to mix.exs:
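The dependency entry would look something like the sketch below. The exact version requirement is an assumption; check the gen_stage package on hex.pm for the current release:

```elixir
# In mix.exs — add gen_stage to the project's dependencies.
defp deps do
  [
    # "~> 1.2" is an assumed version; use the latest from hex.pm
    {:gen_stage, "~> 1.2"}
  ]
end
```

After adding the entry, running `mix deps.get` fetches the dependency.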