Design Key-value Store
Learn to design a key-value store.
Introduction to key-value stores
Key-value stores are distributed hash tables that store data as key-value pairs: each unique key maps to an arbitrary value, such as a blob, string, or serialized object.
Requirements
Let’s list the requirements for a key-value store that overcomes the problems of traditional databases.
Functional requirements
The functional requirements are as follows:
- Configurable service: Some applications may trade strong consistency for higher availability. We need to provide a configurable service so that different applications can use a range of consistency models. We need tight control over the trade-offs between availability, consistency, cost-effectiveness, and performance.
- Ability to always write: The applications should always have the ability to write into the key-value storage. If the user wants strong consistency, this requirement might not always be fulfilled due to the implications of the CAP theorem.
- Hardware heterogeneity: The system shouldn’t have any distinguished nodes; each node should be able to perform any task. Servers may still differ in capacity, since newer hardware is often more capable than older hardware.
Non-functional requirements
The non-functional requirements are as follows:
- Scalable: Key-value stores should run on tens of thousands of servers distributed across the globe. Incremental scalability is highly desirable: we should be able to add or remove servers as needed with minimal to no disruption to service availability. Moreover, our system should handle an enormous number of users of the key-value store.
- Available: We need to provide continuous service, so availability is very important. This property is configurable: if the user wants strong consistency, we’ll have less availability, and vice versa.
- Fault tolerance: The key-value store should operate uninterrupted despite failures in servers or their components.
API design
Key-value stores, like ordinary hash tables, provide two primary functions: get and put.

- get function: The API call to get a value is get(key).
- put function: The API call to store a value is put(key, context, value), where context holds encoded metadata about the object.
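To make the interface concrete, here is a minimal in-memory sketch of the get and put calls described above. The class name and the shape of the context argument are illustrative assumptions, not part of any specific key-value store’s API; a real store would persist and replicate the data.

```python
class KeyValueStore:
    """Illustrative in-memory sketch of the get/put interface."""

    def __init__(self):
        self._data = {}  # key -> value
        self._meta = {}  # key -> context metadata for that object

    def get(self, key):
        """Return the value stored under `key`, or None if absent."""
        return self._data.get(key)

    def put(self, key, context, value):
        """Store `value` under `key`. `context` carries encoded metadata
        about the object (e.g., version information); this sketch simply
        records it alongside the value."""
        self._meta[key] = context
        self._data[key] = value
```

A caller might then write `store.put("user:1", {"version": 1}, profile)` and later read it back with `store.get("user:1")`.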
Note: This lesson is based on Dynamo: Amazon’s Highly Available Key-value Store (https://assets.amazon.science/ac/1d/eb50c4064c538c8ac440ce6a1d91/dynamo-amazons-highly-available-key-value-store.pdf), which is an influential work in the domain of key-value stores.
Add scalability
Let’s start with one of the core design requirements: scalability. We will use consistent hashing to achieve it.
Consistent hashing is an effective way to manage the load over a set of nodes. In consistent hashing, we consider a conceptual ring of hashes from 0 to 2^n − 1, where n is the number of bits in the output of the hash function. Both nodes and keys are mapped onto this ring, and each key is served by the first node encountered while moving clockwise from the key’s position. When a node joins or leaves, only the keys adjacent to it on the ring move, rather than nearly all keys, as would happen with modulo-based hashing.
Point to Ponder
What is the drawback of using consistent hashing?
Use virtual nodes
We’ll use virtual nodes to ensure a more evenly distributed load across the nodes. Instead of applying a single hash function, we’ll apply multiple hash functions onto the same key.
Let’s take an example. Suppose we have three hash functions. For each node, we calculate three hashes and place them into the ring. For the request, we use only one hash function. Wherever the request lands on the ring, it’s processed by the next node found while moving in the clockwise direction. Each server has three positions, so the load of requests is more uniform. Moreover, if a node has more hardware capacity than others, we can add more virtual nodes by using additional hash functions. This way, it’ll have more positions in the ring and serve more requests.
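The example above can be sketched in code. This is a minimal illustration, not a production implementation: hashing the string "node#i" for each virtual-node index i has the same effect as applying several hash functions, and the class and function names are assumptions for this sketch.

```python
import bisect
import hashlib

def ring_hash(s: str) -> int:
    """Map a string onto the ring using MD5 (any uniformly
    distributed hash function works here)."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Consistent hashing with virtual nodes (illustrative sketch)."""

    def __init__(self, vnodes: int = 3):
        self._hashes = []   # sorted ring positions
        self._nodes = []    # node owning the position at the same index
        self._vnodes = vnodes

    def add_node(self, node: str, vnodes: int = None) -> None:
        # Hashing "node#i" for i = 0..k-1 gives the node k positions
        # on the ring. A more capable server can be given a larger
        # `vnodes`, so it owns more positions and serves more requests.
        for i in range(vnodes or self._vnodes):
            h = ring_hash(f"{node}#{i}")
            idx = bisect.bisect(self._hashes, h)
            self._hashes.insert(idx, h)
            self._nodes.insert(idx, node)

    def get_node(self, key: str) -> str:
        # The request is served by the next node clockwise from the
        # key's position; the modulo wraps around the end of the ring.
        idx = bisect.bisect(self._hashes, ring_hash(key)) % len(self._hashes)
        return self._nodes[idx]
```

For instance, after adding nodes "A", "B", and "C", every call to `get_node("user:1")` deterministically returns the same node, and adding a fourth node only remaps the keys that fall between its new ring positions and their predecessors.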
We’ve made the proposed design of key-value storage scalable. The next task is to make our system highly available.
Data replication
We have various methods to replicate the storage. It can be either a primary-secondary relationship or a peer-to-peer relationship.
We’ll use a peer-to-peer relationship for replication. We’ll replicate the data on multiple hosts for durability and high availability. Each data item will be replicated at n hosts, where n is a parameter configured per instance of the key-value store.
Each node will replicate its data to other nodes. We’ll call the node that handles a read or write operation the coordinator; it’s directly responsible for the keys involved. A coordinator node ...
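The replication scheme above can be sketched as follows. In the Dynamo paper, the set of nodes responsible for a key is called its preference list: the coordinator plus the next distinct physical nodes encountered clockwise on the ring, skipping virtual nodes that belong to an already-chosen server. The helper below and its names are illustrative assumptions for this sketch.

```python
import bisect
import hashlib

def ring_hash(s: str) -> int:
    """Map a string onto the ring using MD5."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def preference_list(ring, key, n):
    """Return the n distinct physical nodes that should hold replicas
    of `key`: the coordinator plus the following nodes clockwise,
    skipping virtual nodes that map to an already-chosen server.
    `ring` is a sorted list of (position, node) pairs."""
    positions = [pos for pos, _ in ring]
    distinct = len({node for _, node in ring})
    start = bisect.bisect(positions, ring_hash(key)) % len(ring)
    replicas = []
    i = start
    while len(replicas) < min(n, distinct):
        node = ring[i % len(ring)][1]
        if node not in replicas:
            replicas.append(node)
        i += 1
    return replicas
```

With four physical nodes each holding three virtual nodes, `preference_list(ring, "user:42", 3)` returns three distinct servers, so the item survives the failure of any two of them.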