GenStage-powered Pipeline

Learn how to build a GenStage-powered data pipeline.

Using GenStage

GenStage helps us write a data pipeline that passes data from producers to consumers. GenStage is not an out-of-the-box data pipeline. Instead, it provides a specification for how data is passed between stages, which we then implement in our application’s data pipeline.

GenStage provides two main stage types that are used to model our pipeline:

  • The producer coordinates the fetching of data items and then passes them to the following consumer stage. Producers can fetch data from a database or keep it in memory. In this chapter, our data pipeline will be entirely in memory.

  • The consumer asks for and receives data items from the previous producer stage. These items are then processed by our code before more items are received.
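To make the two stage types concrete, here is a minimal sketch. It is not the pipeline we build later in this chapter. It assumes the gen_stage package can be fetched with Mix.install (Elixir 1.12 or later, with network access). The module names Sketch.Producer and Sketch.Consumer, the in-memory counter, and the demand settings are hypothetical choices made only for illustration.

```elixir
# Assumption: run as a script; Mix.install fetches gen_stage from Hex.
Mix.install([{:gen_stage, "~> 1.0"}])

defmodule Sketch.Producer do
  # Hypothetical in-memory producer: its state is the next integer to emit.
  use GenStage

  def start_link(first) do
    GenStage.start_link(__MODULE__, first, name: __MODULE__)
  end

  @impl true
  def init(first), do: {:producer, first}

  # Called when a consumer asks for `demand` items.
  @impl true
  def handle_demand(demand, next) when demand > 0 do
    events = Enum.to_list(next..(next + demand - 1))
    {:noreply, events, next + demand}
  end
end

defmodule Sketch.Consumer do
  # Hypothetical consumer: forwards each batch it receives to `report_to`.
  use GenStage

  def start_link(report_to) do
    GenStage.start_link(__MODULE__, report_to)
  end

  @impl true
  def init(report_to) do
    # The demand limits are arbitrary values picked for this sketch.
    {:consumer, report_to,
     subscribe_to: [{Sketch.Producer, max_demand: 5, min_demand: 2}]}
  end

  # Called with each batch of items sent by the producer.
  @impl true
  def handle_events(events, _from, report_to) do
    send(report_to, {:processed, events})
    # Consumers never emit events, so the events list is always empty.
    {:noreply, [], report_to}
  end
end

# The producer must be running before the consumer subscribes to it by name.
{:ok, _producer} = Sketch.Producer.start_link(1)
{:ok, _consumer} = Sketch.Consumer.start_link(self())

first_batch =
  receive do
    {:processed, events} -> events
  after
    1_000 -> []
  end

IO.inspect(first_batch, label: "first batch")
```

The consumer drives the flow: the producer emits nothing until the consumer asks for items. The size of each batch depends on the demand settings, so the sketch only inspects whichever batch arrives first.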

We model our pipeline as a sequence of stages. We start with a producer stage that is connected to a consumer stage. We could continue to link together as many stages as needed to model our particular data pipeline, because a consumer can also act as a producer for other consumers (GenStage calls this stage type a producer_consumer). In this chapter, we’ll use the simplest pipeline possible, with only one producer and one consumer stage.
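As an illustration of linking more than two stages, the following sketch places a middle stage between a producer and a consumer. It goes beyond the one-producer, one-consumer pipeline of this chapter and makes the same assumptions as any standalone script: gen_stage is fetched with Mix.install, and the module names Chain.Numbers, Chain.Doubler, and Chain.Collector are hypothetical.

```elixir
# Assumption: run as a script; Mix.install fetches gen_stage from Hex.
Mix.install([{:gen_stage, "~> 1.0"}])

defmodule Chain.Numbers do
  # Hypothetical producer: emits consecutive integers held in memory.
  use GenStage

  def start_link(first) do
    GenStage.start_link(__MODULE__, first, name: __MODULE__)
  end

  @impl true
  def init(first), do: {:producer, first}

  @impl true
  def handle_demand(demand, next) when demand > 0 do
    {:noreply, Enum.to_list(next..(next + demand - 1)), next + demand}
  end
end

defmodule Chain.Doubler do
  # Hypothetical middle stage: consumes from Chain.Numbers and
  # produces doubled values for the stage after it.
  use GenStage

  def start_link(_opts \\ []) do
    GenStage.start_link(__MODULE__, :ok, name: __MODULE__)
  end

  @impl true
  def init(:ok) do
    {:producer_consumer, :ok, subscribe_to: [{Chain.Numbers, max_demand: 10}]}
  end

  # Unlike a plain consumer, this stage returns events to pass downstream.
  @impl true
  def handle_events(events, _from, state) do
    {:noreply, Enum.map(events, &(&1 * 2)), state}
  end
end

defmodule Chain.Collector do
  # Hypothetical final consumer: forwards each batch to `report_to`.
  use GenStage

  def start_link(report_to), do: GenStage.start_link(__MODULE__, report_to)

  @impl true
  def init(report_to) do
    {:consumer, report_to, subscribe_to: [{Chain.Doubler, max_demand: 10}]}
  end

  @impl true
  def handle_events(events, _from, report_to) do
    send(report_to, {:doubled, events})
    {:noreply, [], report_to}
  end
end

# Start the stages from the producer end so each subscription finds its target.
{:ok, _} = Chain.Numbers.start_link(1)
{:ok, _} = Chain.Doubler.start_link()
{:ok, _} = Chain.Collector.start_link(self())

first_batch =
  receive do
    {:doubled, events} -> events
  after
    1_000 -> []
  end

IO.inspect(first_batch, label: "first doubled batch")
```

Demand still originates at the end of the chain: the collector asks the doubler for items, and the doubler in turn asks the producer.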

Let’s jump right into building a data pipeline. The pipeline that we’ll have by the end of this chapter is generic and can be used for many cases. I often start with the same base configuration and add to it as necessary. Here’s what we’ll be building: a single producer stage connected to a single consumer stage.
