Detailed Design of a Distributed Cache

Discover how to refine a distributed cache design by eliminating single points of failure and improving availability. Implement a configuration service for server discovery and use primary-replica sharding to ensure data consistency. Learn the internal workings of cache servers, including hash maps and LRU eviction policies.

This lesson identifies limitations in the high-level design and refines the architecture to address them.

Find and remove limitations

Before we get to the detailed design, we must resolve three specific challenges:

  • Service discovery: Cache clients have no mechanism to detect when cache servers are added or fail.

  • SPOF and performance: Storing a dataset on a single server makes that server a single point of failure (SPOF). In addition, frequently accessed keys (hot keys) can overload a single node and degrade performance.

  • Server internals: The design lacks details regarding internal data structures and eviction policies.
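To make the first challenge concrete, here is a minimal Python sketch of a cache client that holds a fixed server list. The class name `StaticCacheClient`, the server names, and the modulo-based key mapping are all assumptions made for illustration, not part of the lesson's design. The sketch only shows that such a client keeps routing keys to a server after it has failed, because nothing tells it that the list has changed.

```python
import hashlib


class StaticCacheClient:
    """Hypothetical client with a fixed server list and no discovery mechanism."""

    def __init__(self, servers):
        # The list is copied once at startup and never refreshed.
        self.servers = list(servers)

    def server_for(self, key):
        # Use a stable hash so the same key always maps to the same index.
        # Modulo mapping is an assumption chosen for simplicity.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.servers)
        return self.servers[index]


def unreachable_keys(client, keys, live_servers):
    """Return the keys the client would route to a server that is not live."""
    return [k for k in keys if client.server_for(k) not in live_servers]


client = StaticCacheClient(["cache-a", "cache-b", "cache-c"])
keys = [f"user:{i}" for i in range(1000)]

# cache-b fails, but the client is never told about it.
live = {"cache-a", "cache-c"}
lost = unreachable_keys(client, keys, live)
print(f"{len(lost)} of {len(keys)} keys still route to the failed server")
```

Every request for one of those keys would fail or time out until someone updates the client's list by hand. The same gap applies when a server is added: the client never sends it any traffic.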

Maintain the cache servers list

We will address the service discovery problem first. The following ...