Design approach

There are a number of different approaches to making key-value stores, each of which can result in a different system.

For example, some are solely in the RAM and not backed by persistent storage. While these can provide top-notch performance, they are costly and primarily support fast reads only. The key-value cache will need to be rebuilt on any server failure from the source store.

On the other hand, for some key-value stores, everything is stored in a persistent disk; this is economical but low in performance due to inherently slow IO operations. However, there have been suggestions to improve performance in such stores. Ideally, a key-value store should use memory and storage to provide acceptable dollar cost and performance. However, flexibility in using these two major resources depends on our approach to designing a key-value store.

Our experience with key-value store designs shows that it is challenging to get the best of both worlds (economic and fast reads and writes) using a single key-value store. In our design, we will use a neat idea where we collectively use many kinds of stores, each optimized for a specific goal, and providing a good trade-off point. This approach is called a multi-store approach. In the following section, we will elaborate on why single-store approaches limit us from achieving our goals and then show how multi-store design copes with these limits.

Single-store approach

The single-store approach uses only one key-value store with features to improve performance. Generally, a single-store key-value pair has three main components:

  1. A filterThis filters requests and only allows those requests to pass through for which keys are stored in the key-value store, stored in memory, reduces disk reads by checking if a key is stored or not. Read requests that pass the filter result in disk reads.

  2. An in-memory index to provide the address of a key in storage.

  3. A storage data layout that contains all of our key-value entries.

A single-store approach makes it hard to simultaneously achieve all our design goals (high-performance with efficient use of resources). Let's look at some single-store key-value stores.

HashCache

HashCache is a single-store key-value store. We should consider the following two features of HashCache:

  1. It has no in-memory index. Instead, it has a hash table in storage. This reduces the use of expensive memory.

  2. HashCache keeps track of its cached objects via a special in-memory table. It shifts the workload of confirming cache misses to memory and is optimized to use low memory.

Note: The in-memory table differs from a standard in-memory index. This table is only for cache, used for storing frequently accessed items. The size of this table does not increase proportionally with the number of entries stored as it would with the in-memory index.

HashCache is efficient with memory since it only uses memory for cache and stores all its keys in a hash table in storage. However, it is hard to scale.

HashCache stores all its entries in an in-storage hash table. Scaling it would require a large hash table. A large hash table will take up a significant portion of storage. This will increase the chances of keys in consequent requests pointing to values far from each other on storage. For these requests, the storage hardware will first find the address of the value for the first key, then the second key. Such requests are called random requests. Random reads require searching for the address of the value to return. Similarly, random writes require searching for the address to write the value. We need to note that for storage, regardless of the hardware used (disk, SSD, or flash), sequential reading and writing are faster than random reading and writing.

Log-based key-value stores

We can solve the problem above by ensuring our system writes sequentially to storage—appending new entries to a log. Some single-store key-value stores that apply this technique are FAWN-DSFlashStore, and SkimpyStash.

FAWN-DS uses an in-memory hash table based on a partial key a part of the key to locate the key on the in-storage log. It appends a new entry to the log and stores its offset on the hash table. Both FlashStore and SkimpyStash maintain a buffer in memory, write to storage after the buffer is full, and update their in-memory hashtable, which resolves to a location in storage. Such buffered writing is efficient in terms of throughput, latency, and using write-cycles of flash disks.

Another example of a log-based KV store is the Microsoft FASTER.

The problem with a single-store approach

Writing sequentially to storage and keeping addresses in memory solves the problem of degraded performance with random requests. However, such a system is hard to scale. As the size of the key-value store increases, so does its per-key memory consumption because of the in-memory hash table.

While a single-store approach keeps things simple, it limits our ability to achieve a better design.

Level up your interview prep. Join Educative to access 70+ hands-on prep courses.