Preamble

In the last lesson, we decided to use multiple stores in our design, each optimized for a particular purpose. These stores include a write-friendly store, intermediary stores, and a memory-efficient store. Collectively, these stores will help our design achieve good performance and low memory overhead per key. Let's dive into our first store: the write-friendly store.

Our write-friendly store will handle all PUT and DELETE requests. The PUT requests represent new entries or update old entries. The DELETE requests are special operations that request the removal of a key-value entry. For the GET request, we'll later use a higher-level wrapper to find a key in all the stores. Doing so is necessary because, over time, keys trickle from a write-friendly store to a memory-efficient store.

Appending to storage log for fast write

Memory provides fast random access. However, storing all the key-value data in memory will need a lot of RAM (which is fairly expensive and might not be enough to hold all the data). Writing to storage is slow, and storage generally does not perform well with random writes. However, sequential writing to storage is faster than random writing.

Before moving on, let's refresh the difference in random and sequential writing performance for the most commonly used storage hardware.

Random vs. sequential writing

We will not dive into the details of random vs. sequential access. We will compare the differences in random and sequential performances of some modern hardware to emphasize the importance of the considerations made in our design.

Note: We prefer writing and reading traditional rotating disks and newer flash-based disks sequentially for different reasons. Sequential writing is faster on the rotating disks because doing so avoids unnecessary head-seek latencies, and for flash-based disks we ...

Prologue

File Systems

Google File System (GFS)

Google Colossus File System

Facebook's Tectonic File System

Databases

Google Bigtable

Google Megastore

Google Spanner

Key-value Stores

Many-core Key-value Store

Scaling Memcache

SILT

Amazon DynamoDB

Concurrency Management

Two-phase Locking (2PL)

Google Chubby Locking Service

ZooKeeper

Big Data Processing: Batch to Stream Processing

MapReduce

Spark

Kafka

Consensus

Understanding Consensus: Two Generals, FLP, & Byzantine Generals

Two-phase Commit

State Machine Replication

Paxos

Raft

Epilogue

A Write-friendly Store for SILT: Part I

Preamble

Appending to storage log for fast write

Random vs. sequential writing