
Spark Fundamentals

Explore the fundamentals of Apache Spark, including its architecture, in-memory computing, and advantages over MapReduce. Learn how Spark processes data in parallel across clusters to achieve faster big data processing and gain insight into its core components and use cases for scalable cloud and batch applications.

Why choose Spark?

As the demand to process data and extract information from it continues to grow, engineers and data scientists increasingly look for simple, flexible tools for parallel data analysis. This need has become even more apparent with the rise of cloud computing, where processing power and horizontal scaling are readily available.

Spark fits this picture as one such tool for the following principal reasons:

Ease of use: Spark is straightforward to use compared with tools that pre-date it, such as Hadoop's MapReduce engine. Its high-level APIs let developers focus on the logic of their computation rather than the mechanics of distributing it, and it can be installed and run on an ordinary laptop (see the sketch after this list).

...
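To make the ease-of-use point concrete, here is a minimal sketch using the PySpark DataFrame API. It assumes pyspark has been installed locally (for example via pip); the application name, column names, and sample values are purely illustrative.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# local[*] runs Spark on all cores of the current machine; pointing the
# master at a cluster manager (e.g. YARN or Kubernetes) instead would run
# the same code across a cluster without changing the computation logic.
spark = SparkSession.builder.master("local[*]").appName("spark-fundamentals-demo").getOrCreate()

# Build a small DataFrame of (job_type, duration_sec) rows -- illustrative data only.
df = spark.createDataFrame(
    [("web", 120), ("web", 80), ("batch", 300)],
    ["job_type", "duration_sec"],
)

# A grouped aggregation expressed with the high-level API; Spark plans and
# executes it in parallel across the available cores or executors.
df.groupBy("job_type").agg(F.avg("duration_sec").alias("avg_duration_sec")).show()

spark.stop()
```

The same few lines scale from a laptop to a cluster because the developer describes *what* to compute, while Spark decides how to partition the data and schedule the work.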