Spark Differentiation

Learn what factors and features differentiate the Spark framework from other processing engines.

Spark is a unified engine for large-scale distributed data processing, whether on-premises in a data center or in the cloud. Some of its key characteristics and differentiators are as follows:

  • Speed: Spark takes advantage of hardware advances, multithreading, and multiprocessing to execute workloads efficiently. To compute a query, Spark builds a directed acyclic graph (DAG) of the required computations. The graph can be decomposed into tasks that execute in parallel across the cluster. Spark’s physical execution engine, known as Tungsten, uses whole-stage code generation to produce compact code for execution. We’ll study these concepts in later lessons. Spark stores intermediate results in memory instead of writing them to disk, which limits disk I/O and boosts ...
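
The idea of splitting a query into independent tasks and keeping intermediate results in memory can be illustrated with a toy sketch in plain Python. This is not Spark code and does not reflect how Spark's scheduler or Tungsten are implemented. The names `partitions`, `map_stage`, `reduce_stage`, and `run_query` are hypothetical, chosen only for illustration, and a thread pool stands in for the executors of a cluster.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical input: a dataset that has already been split into partitions.
partitions = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]


def map_stage(partition):
    """Per-partition task: square each value, then keep the even results."""
    squared = (x * x for x in partition)
    return [x for x in squared if x % 2 == 0]


def reduce_stage(partial_results):
    """Final task: combine the per-partition results into a single total."""
    return sum(sum(part) for part in partial_results)


def run_query(parts, max_workers=3):
    # The map tasks do not depend on each other, so they can run in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Intermediate results stay in an in-memory list; nothing is written to disk.
        intermediate = list(pool.map(map_stage, parts))
    # The reduce task depends on every map task, so it runs only after they finish.
    return intermediate, reduce_stage(intermediate)


intermediate, total = run_query(partitions)
print(intermediate)  # [[4], [16, 36], [64]]
print(total)         # 120
```

The two-level dependency here (three independent map tasks feeding one reduce task) is about the smallest graph that shows the pattern. A real Spark query plan would typically contain more stages and would be distributed across machines rather than threads.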