Spark SQL Engine

Get an introduction to the Spark SQL engine and its two main components, Project Tungsten and the Catalyst optimizer.

We'll cover the following...

Overview
Components of Spark SQL

Overview

Spark SQL allows developers to programmatically issue ANSI SQL:2003-compatible queries on structured data with a schema. It first shipped as an alpha component in Spark 1.0 and graduated from alpha in version 1.3. Since then, several higher-level functionalities have been built on it. Among other things, the Spark SQL engine:

Generates optimized query plans and compact JVM code for final execution.
Serves as a bridge to external tools through standard database JDBC/ODBC connectors.
Reads and writes structured files in formats such as JSON, CSV, and Avro, and converts the data into temporary tables.
Connects to the Apache Hive metastore and tables.
Introduces an interactive Spark ...
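
Several of the capabilities listed above can be seen together in a short PySpark sketch: reading a JSON file, registering it as a temporary view, querying it with SQL, and printing the plans the engine produces. This is only an illustration under stated assumptions. The sample records, the view name employees, the query, and the helper names write_sample_json and main are all hypothetical and do not come from the lesson. The sketch also assumes a local PySpark installation.

```python
import json
import os
import tempfile

# Hypothetical sample records, invented for illustration only.
SAMPLE_ROWS = [
    {"name": "Ada", "dept": "eng", "salary": 120},
    {"name": "Grace", "dept": "eng", "salary": 130},
    {"name": "Linus", "dept": "ops", "salary": 100},
]

# Hypothetical query against a temporary view named "employees".
QUERY = """
    SELECT dept, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY dept
    ORDER BY dept
"""


def write_sample_json(directory):
    """Write SAMPLE_ROWS as JSON Lines (one object per line),
    the layout spark.read.json expects by default."""
    path = os.path.join(directory, "employees.json")
    with open(path, "w", encoding="utf-8") as f:
        for row in SAMPLE_ROWS:
            f.write(json.dumps(row) + "\n")
    return path


def main():
    # Imported here so the helpers above stay usable without PySpark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")
        .appName("spark-sql-sketch")
        .getOrCreate()
    )
    try:
        with tempfile.TemporaryDirectory() as tmp:
            path = write_sample_json(tmp)
            df = spark.read.json(path)               # schema is inferred from the file
            df.createOrReplaceTempView("employees")  # expose the data to SQL
            result = spark.sql(QUERY)
            result.explain(True)  # print the logical and physical plans
            result.show()         # trigger execution while the file still exists
    finally:
        spark.stop()
```

Calling main() in an environment with PySpark available runs the query end to end. The output of explain(True) is where the optimized query plan mentioned in the list becomes visible.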
