Transformations (II): FlatMap and Distinct

Get introduced to the second set of basic transformations.

FlatMap

The FlatMap operation is a long-standing part of functional programming, and it can be tricky to understand conceptually. There are two key things to know about what FlatMap does:

  1. Being a map transformation in nature, it applies a function to each element of a collection. This is no different from the plain map() function described before.

  2. If applying the function yields a collection of collections of elements (say a List of Lists, or an array of arrays), it flattens the results into a single collection.

So fundamentally, both map and flatMap transform objects based on a function, but they differ in how the results are assembled. The former produces exactly one output element per input element, while the latter lets each input element produce a collection of zero or more elements and then merges those collections into one.
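
To make the difference concrete before turning to Spark, here is a minimal sketch in plain Python (no Spark involved). The helper names `plain_map` and `flat_map` are hypothetical and exist only for illustration; they are not Spark APIs. The sketch splits two sentences into words, once with a map and once with a flatMap.

```python
from itertools import chain


def plain_map(func, items):
    """Apply func to every element; one output entry per input element."""
    return [func(item) for item in items]


def flat_map(func, items):
    """Apply func to every element, then flatten the results by one level."""
    return list(chain.from_iterable(func(item) for item in items))


# Illustrative input: each element is a sentence.
lines = ["to be or", "not to be"]

# map keeps the nesting: one list of words per sentence.
mapped = plain_map(lambda line: line.split(), lines)

# flatMap merges the per-sentence lists into a single list of words.
flattened = flat_map(lambda line: line.split(), lines)

print(mapped)     # [['to', 'be', 'or'], ['not', 'to', 'be']]
print(flattened)  # ['to', 'be', 'or', 'not', 'to', 'be']
```

Note that the map result has the same number of entries as the input (two), whereas the flatMap result has as many entries as there are words in total (six).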

In Spark, the concept is similar, though there are some differences, so let’s start by visualizing it graphically and then practice it in a code example.
