Motivation

Key-Value stores are distributed hash tables (DHT). They are useful in many situations such as storing user sessions in a web application and for building No-SQL databases. In a distributed environment, it has proved very challenging to scale traditional databases with strong consistency and high availability. A key-value store only binds a key to a specific value and does not assume anything about the structure of the key or the value (except the keys should be unique). These simpler semantics help a key-value store to scale at the global level with high availability and a spectrum of options for data consistency. In the context of the CAP theorem, key-value stores can either be consistent or available when there are network partitions. At configuration time, a key-value store can be instantiated to favor of consistency and availability.

Many real-world services only need primary-key access to a data store instead of traditional OLTP (Online Transaction Processing) databases. Examples include bestseller lists, shopping carts, customer preferences, session management, sales rank, and product catalog.

(ToDo: Above picture copied from https://hazelcast.com/glossary/key-value-store/ We need our own illustration)

Key-Value stores use many servers for the storage and retrieval of data. A single-node-based hashtable falls short due to one or more of the following reasons:

  • No matter how big a server we could get, data storage and query requirements can not be met by this server.
  • Failure of this one mega-server will mean service downtime for everyone.

While designing a key-value store we will exercise many of the distributed systems concepts such as:

  • consistent hashing
  • logical clocks
  • data consistency
  • sloppy quorums
  • gossip-based distributed failure detection and membership protocol

We will also see how good engineering often goes hand-in-hand with a solid underpinning for a practical system. We will draw on Amazon’s Dynamo Key-Value storeDynamo: amazon’s highly available key-value store. In Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles (SOSP '07). Association for Computing Machinery, New York, NY, USA, 205–220. for this purpose due to its influence on subsequent key-value stores (like Casandra and DynamoDB).

Rich programming model provided by traditional DBMS might not be required by many applications. Using RDBMS for such applications is often expensive both in terms of dollars and performance.

User-facing API

Key-Value stores, like ordinary hash tables, provide two primary functions:

  • Get: given a key, provide the associated value.

When data is replicated, the operation of locating the object replica that is associated with a specific key is hidden from the end-user and is done by the system. If the store was configured with a weaker data consistency model (for example eventual consistency, there might be more than one value returned against a key, as we will see later).

  • Put: store the value associated with the key

The system automatically determines where data should be placed. Additionally system often keeps metadata about the stored object as well. Such metadata can include the version of the object.

Create a free account to access the full course.

By signing up, you agree to Educative's Terms of Service and Privacy Policy