Introduction to Big Data Processing Systems
Discover the fundamentals of big data processing by examining key systems such as MapReduce, Spark, and Kafka. Learn how these systems address large-scale data challenges by enabling efficient batch processing, low-latency operations, and real-time data streaming. Understand their use cases, trade-offs, and the underlying principles that support modern distributed data processing.
Motivation
It is hardly an overstatement to say that data runs our world. From calculating accurate travel times in a map application by taking live traffic information into account, to personalized recommendations for nearly every service, such as shopping and music playlists, it is data that must be harnessed to get the right information.
What we will learn
We have selected three big data processing papers to discuss in the following few chapters:
- [MapReduce] Dean, Jeffrey, and Sanjay Ghemawat. MapReduce: simplified data processing on large clusters. OSDI’04: Sixth Symposium on Operating System Design and Implementation (2004): pp. 137-150. ...