Group and Sort

Explore how to use Elixir's Flow library to group and sort large datasets concurrently. Learn how to use functions such as Flow.group_by and Flow.take_sort, and how to optimize processing pipelines. Understand how these tools help you organize data and improve performance in concurrent data workflows.

We'll cover the following...

Group
Sort
Run a Flow
The playground

Elixir

# File path: airports/lib/airports.ex
def open_airports() do
  airports_csv()
  |> File.stream!()
  |> Flow.from_enumerable()
  |> Flow.map(fn row ->
    [row] = CSV.parse_string(row, skip_headers: false)
    %{
      id: Enum.at(row, 0),
      type: Enum.at(row, 2),
      name: Enum.at(row, 3),
      country: Enum.at(row, 8)
    }
  end)
  |> Flow.reject(&(&1.type == "closed"))
  |> Flow.partition(key: {:key, :country})
  |> Flow.group_by(& &1.country)
  |> Flow.map(fn {country, data} -> {country, Enum.count(data)} end)
  |> Enum.to_list()
end
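
To make the intent of the pipeline above easier to follow, here is a minimal sequential sketch of the same reject, group, and count steps using only the standard Enum module, followed by a sort on the counts. It is not the Flow implementation: it runs in a single process, and the sample rows and the `airports` and `counts` names are hypothetical, made up for illustration.

```elixir
# Hypothetical sample rows, shaped like the maps built in open_airports/0.
airports = [
  %{id: "1", type: "small_airport", name: "Alpha", country: "US"},
  %{id: "2", type: "closed", name: "Bravo", country: "US"},
  %{id: "3", type: "heliport", name: "Charlie", country: "CA"},
  %{id: "4", type: "large_airport", name: "Delta", country: "US"},
  %{id: "5", type: "small_airport", name: "Echo", country: "BR"},
  %{id: "6", type: "small_airport", name: "Foxtrot", country: "CA"},
  %{id: "7", type: "medium_airport", name: "Golf", country: "US"}
]

counts =
  airports
  # Drop closed airports, mirroring the Flow.reject/2 step.
  |> Enum.reject(&(&1.type == "closed"))
  # Group the remaining rows by country code.
  |> Enum.group_by(& &1.country)
  # Replace each group with the number of rows it holds.
  |> Enum.map(fn {country, data} -> {country, Enum.count(data)} end)
  # Sort by count, largest first (the :desc option needs Elixir 1.10+).
  |> Enum.sort_by(fn {_country, count} -> count end, :desc)

IO.inspect(counts)
# Prints: [{"US", 3}, {"CA", 2}, {"BR", 1}]
```

The Flow version distributes the same work across partitions, so each partition only sees the countries routed to it by the `:country` key.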

Group