Route Events with Partitions
Explore how to manage event routing in concurrent Elixir data pipelines by adding partition layers. Understand Flow's partition function to group items by key, ensuring accurate reductions without duplicates while maintaining concurrency.
Problems with Flow.reduce
As we know, Enum.reduce/3 works within a single process. With Flow.reduce/3, we have many processes, and each one continuously receives batches of items to work on. This enables Flow to balance the workload and handle back-pressure. However, each process keeps its own accumulator, so items that should be counted together can end up being consumed by more than one process. When all the partial results are combined at the end, the same key appears more than once. The figure below illustrates the problem.
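To make the duplicate problem concrete, here is a minimal sketch in plain Elixir. It does not use Flow. As an assumption for illustration only, it imitates two reducer processes with Task.async/1 and splits the input into fixed batches with Enum.chunk_every/2. The sample list `items` and the helper `count_batch` are hypothetical names, not part of the lesson's project.

```elixir
# Hypothetical sample data: "a" and "b" occur in both batches.
items = ["a", "b", "a", "c", "b", "a"]

# Counts occurrences within one batch, like a single reducer process would.
count_batch = fn batch ->
  Enum.reduce(batch, %{}, fn item, acc ->
    Map.update(acc, item, 1, &(&1 + 1))
  end)
end

results =
  items
  # Assumption: two fixed batches of three items stand in for the
  # batches that separate processes would receive.
  |> Enum.chunk_every(3)
  # Each batch is reduced in its own process with its own accumulator.
  |> Enum.map(fn batch -> Task.async(fn -> count_batch.(batch) end) end)
  # Concatenate the partial results without merging them.
  |> Enum.flat_map(fn task -> task |> Task.await() |> Map.to_list() end)
  |> Enum.sort()

IO.inspect(results)
# prints [{"a", 1}, {"a", 2}, {"b", 1}, {"b", 1}, {"c", 1}]
```

The keys "a" and "b" show up twice because each process only sees its own batch. In a real Flow pipeline the batches are not fixed chunks like these, so which items land in which process can vary between runs, but the effect on the combined result is the same.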
Let’s use the :stages option to reduce concurrency to 1 when we call Flow.from_enumerable/2. We can do this in ...