Anatomy of a Spark Application

Read up on the constituents of a Spark application.

Building blocks of a Spark application

In this lesson, we'll formally look at the various components of a Spark application. A Spark application consists of one or more jobs. Unlike a MapReduce job, however, a Spark job is much broader in scope.
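To make this concrete, here is a minimal sketch of a single Spark application whose driver submits two separate jobs, one per action. The object name, input, and output paths are hypothetical placeholders, not part of the lesson.

```scala
import org.apache.spark.sql.SparkSession

// A minimal sketch: one Spark application in which each action triggers its own job.
object AnatomyExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("anatomy-example")
      .getOrCreate()
    val sc = spark.sparkContext

    // Transformations are lazy: no job is submitted yet.
    val lines = sc.textFile("hdfs:///data/input.txt") // hypothetical input path
    val words = lines.flatMap(_.split("\\s+"))

    // Action 1: triggers the application's first job.
    println(s"Word count: ${words.count()}")

    // Action 2: triggers a second, independent job.
    words.distinct().saveAsTextFile("hdfs:///data/distinct-words") // hypothetical output path

    spark.stop()
  }
}
```

Each call to an action (`count`, `saveAsTextFile`) causes the driver to submit a new job, so a single application routinely runs several jobs over its lifetime.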

Each job is made up of a directed acyclic graph (DAG) of stages. A stage is roughly equivalent to a map or reduce phase in MapReduce. The Spark runtime splits each stage into tasks, which execute in parallel on partitions of an RDD across the cluster. A task is the single unit of work or execution sent to a Spark executor; each task maps to a single core and works on a single partition of data. The relationship among these concepts is depicted below:
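The code sketch below complements the diagram by showing how these pieces arise in practice. It assumes the same `sc` (SparkContext) and hypothetical input path as the earlier example; the partition counts are illustrative, not prescribed by the lesson.

```scala
// Stage 0: the input is read into 8 partitions, so this stage runs as 8 tasks.
// flatMap and map are narrow transformations, so they stay within the same stage.
val pairs = sc.textFile("hdfs:///data/input.txt", 8) // hypothetical path; 8 partitions
  .flatMap(_.split("\\s+"))
  .map(word => (word, 1))

// reduceByKey requires a shuffle, which marks a stage boundary.
// Stage 1 has 4 partitions, so it runs as 4 tasks.
val counts = pairs.reduceByKey(_ + _, 4)

// The action submits one job, whose DAG consists of the two stages above.
counts.collect()
```

Under these assumptions, the single `collect` action produces one job with two stages, and the number of tasks in each stage equals the number of partitions that stage processes.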
