Apache Spark and Its Components

In this lesson, you’ll learn about Apache Spark, a processing engine that is commonly used within the Hadoop ecosystem and is known for its efficient in-memory data processing.

We'll cover the following...

- Apache Spark
- Apache Spark Workloads
- How Apache Spark Works
  - Resilient Distributed Datasets
- Advantages of Apache Spark

Apache Spark

Spark was started in 2009 at the University of California, Berkeley. Apache Spark is an open-source, distributed processing system for big data workloads. It has several advantages over Hadoop MapReduce, and the most notable is speed: it uses in-memory caching and optimized query execution to return query results quickly. It is well suited to machine learning, graph analytics, batch processing, and real-time processing. It provides APIs for popular programming languages such as Java, Scala, Python, and R.
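
To make the idea of in-memory caching concrete, here is a minimal sketch in plain Python. It is not Spark code: `ToyDataset`, `expensive_squares`, and `compute_count` are hypothetical names invented for this illustration. The sketch only mimics the general pattern in which a dataset is recomputed on every request unless it has been marked for caching, similar in spirit to calling `cache()` on a Spark dataset.

```python
from typing import Callable, Iterable, List, Optional


class ToyDataset:
    """Hypothetical stand-in for a distributed dataset (not a Spark class)."""

    def __init__(self, compute: Callable[[], Iterable[int]]) -> None:
        self._compute = compute                    # recipe for producing the data
        self._use_cache = False                    # set to True by cache()
        self._cached: Optional[List[int]] = None   # in-memory copy, once built
        self.compute_count = 0                     # how often the recipe actually ran

    def cache(self) -> "ToyDataset":
        """Ask for the result to be kept in memory after the first computation."""
        self._use_cache = True
        return self

    def collect(self) -> List[int]:
        """Return the data, reusing the in-memory copy when one exists."""
        if self._use_cache and self._cached is not None:
            return self._cached
        self.compute_count += 1
        result = list(self._compute())
        if self._use_cache:
            self._cached = result
        return result


def expensive_squares() -> Iterable[int]:
    """Pretend this is a costly job, such as reading and transforming a large file."""
    return (n * n for n in range(5))


# Without caching, every collect() reruns the computation.
uncached = ToyDataset(expensive_squares)
uncached.collect()
uncached.collect()

# With caching, the second collect() is served from memory.
cached = ToyDataset(expensive_squares).cache()
cached.collect()
cached.collect()

print(uncached.compute_count, cached.compute_count)
```

In this toy version the uncached dataset runs its computation twice, while the cached one runs it only once. Real Spark applies the same principle across the memory of many machines, which is a large part of why repeated queries over the same data can be fast.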
