High performance

The following design choices contribute to good overall performance:

We use consistent hashing. Locating the shard that owns a key takes O(log(N)) time, where N is the number of cache shards.
Inside a cache server, keys are located using hash tables that require constant time on average.
The LRU eviction approach takes constant time to access and update cache entries, because the hash table points directly at each entry's node in a doubly linked list.
Cache clients and servers communicate directly over TCP and UDP, which keeps protocol overhead low.
Because we added replicas, the request load can be spread across them, which reduces the performance penalty of a high request load on a single machine.
An important feature of the design is that data is added, retrieved, and served from RAM, so the latency of these operations is quite low.
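
To make the first three points concrete, here is a minimal Python sketch, not the design's actual implementation. It assumes one point per shard on the ring (no virtual nodes), uses MD5 as an arbitrary hash function, and relies on Python's `OrderedDict`, which pairs a hash table with a doubly linked list internally. The names `HashRing`, `LRUCache`, and `shard_for` are hypothetical.

```python
import bisect
import hashlib
from collections import OrderedDict


def _hash(value: str) -> int:
    """Map a string to a point on the ring (MD5 is an arbitrary choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring: lookup is a binary search, so O(log N)."""

    def __init__(self, shards):
        # Sort the shards by their position on the ring.
        self._points = sorted((_hash(s), s) for s in shards)
        self._positions = [pos for pos, _ in self._points]

    def shard_for(self, key: str) -> str:
        # First shard clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._positions, _hash(key))
        return self._points[idx % len(self._points)][1]


class LRUCache:
    """Per-shard store with O(1) average lookup, update, and eviction."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, key):
        if key not in self._entries:
            return None  # cache miss
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, value):
        if key in self._entries:
            self._entries.move_to_end(key)
        self._entries[key] = value
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used
```

A client would first call `shard_for(key)` to pick a cache server and then issue `get` or `put` against that server's `LRUCache`. A production ring would typically add virtual nodes per shard to even out the key distribution.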

Note: A critical parameter for high performance is the selection of the eviction algorithm because the number of cache hits and misses depends on it. The higher the cache hit rate, the better the performance.

To get an idea of how important the eviction algorithm is, let’s assume the following:

Cache hit service time ($99.9^{\text{th}}$ percentile): 5 ms
Cache miss service time ($99.9^{\text{th}}$ percentile): 30 ms (this includes the time to get the data from the database and set the cache)

Suppose the most frequently used (MFU) algorithm yields a 10% cache miss rate, whereas the LRU algorithm yields a 5% cache miss rate. We can then compare the two using the following formula:

$EAT$ ...
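
Assuming the usual definition of effective access time, $EAT = Ratio_{hit} \times Time_{hit} + Ratio_{miss} \times Time_{miss}$, the two eviction algorithms can be compared with a short sketch. The function name `effective_access_time` is hypothetical, and the inputs are the service times and miss rates assumed above.

```python
def effective_access_time(miss_ratio, hit_time_ms, miss_time_ms):
    """Weighted average of hit and miss service times, in milliseconds."""
    hit_ratio = 1.0 - miss_ratio
    return hit_ratio * hit_time_ms + miss_ratio * miss_time_ms


HIT_TIME_MS = 5.0    # assumed cache hit service time
MISS_TIME_MS = 30.0  # assumed cache miss service time (database fetch + cache set)

eat_mfu = effective_access_time(0.10, HIT_TIME_MS, MISS_TIME_MS)
eat_lru = effective_access_time(0.05, HIT_TIME_MS, MISS_TIME_MS)

print(f"MFU: {eat_mfu:.2f} ms")
print(f"LRU: {eat_lru:.2f} ms")
```

Under these assumptions, MFU gives $0.9 \times 5 + 0.1 \times 30 = 7.5$ ms and LRU gives $0.95 \times 5 + 0.05 \times 30 = 6.25$ ms, so halving the miss rate saves 1.25 ms per request on average.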
