Motivation

Databases are a critical component of most real-world systems. Many systems moved from traditional relational databases to NoSQL solutions to meet evolving scalability and availability needs, and they often paid for it by relinquishing strong consistency or accepting lower performance. The database community has long aimed for the best of both worlds: the strong consistency and high performance of relational databases together with the scalability and availability of NoSQL databases. While we still have a way to go, research and development over the last few decades have brought us closer to that goal.

What we will learn

We’ve selected the following three papers to discuss in the next few chapters:

[Bigtable] Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. 2006. ...

