Closing Remarks

Final thoughts.

The Big Data ecosystem is still young, but Spark stands out for its rapid adoption across enterprises, often in place of the traditional Hadoop stack’s MapReduce paradigm. If you have previously worked with MapReduce, you’ll appreciate Spark more and better understand the pain points it addresses in Hadoop’s MapReduce model. Spark is versatile and works well with a variety of old and new Big Data technologies. For example, it can run on YARN and use HDFS as storage, both of which come from the original Hadoop stack. As an in-memory distributed processing engine, Spark enhances rather than replaces the capabilities of the modern Big Data technology stack.

The course is compact but comprehensive, and it intentionally avoids deep discussions of Spark’s internals and technical workings. It covers the fundamentals of Spark thoroughly, with the aim of giving the reader enough context and knowledge to independently learn and work with Spark’s more complex tasks and capabilities.

Readers are welcome and encouraged to share any feedback on this course. Thank you for your support.
