Detailed Design of a Distributed Cache
Discover how to refine a distributed cache design by eliminating single points of failure and improving availability. Implement a configuration service for server discovery and use primary-replica sharding to ensure data consistency. Learn the internal workings of cache servers, including hash maps and LRU eviction policies.
This lesson identifies limitations in the high-level design and refines the architecture to address them.
Find and remove limitations
Before we get to the detailed design, we must resolve three specific challenges:
Service discovery: Cache clients have no mechanism to detect when cache servers are added or when they fail.
SPOF and performance: Storing a dataset on only one server makes that server a single point of failure (SPOF). Additionally, frequently accessed keys (hot keys) can overload a single node, degrading performance.
Server internals: The design lacks details regarding internal data structures and eviction policies.
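To make the first challenge concrete, here is a minimal sketch of a cache client that receives its server list once and never refreshes it. The class name `StaticCacheClient`, the server names, and the modulo-based server selection are assumptions made for illustration; they are not part of the design described in this lesson.

```python
import zlib


class StaticCacheClient:
    """Hypothetical client that receives its server list once, at startup."""

    def __init__(self, servers):
        # The list is copied once and never refreshed afterwards.
        self.servers = list(servers)

    def pick_server(self, key):
        # A deterministic hash, so every client maps a key to the same server.
        index = zlib.crc32(key.encode("utf-8")) % len(self.servers)
        return self.servers[index]


# The set of servers that are actually running (illustrative names).
live_servers = {"cache-a", "cache-b", "cache-c"}
client = StaticCacheClient(sorted(live_servers))

# Simulate one failure and one addition that the client never hears about.
live_servers.discard("cache-b")
live_servers.add("cache-d")

keys = [f"user:{i}" for i in range(100)]

# Keys the client still routes to a server that is no longer running.
stale = [k for k in keys if client.pick_server(k) not in live_servers]

# Live servers that the client never sends any request to.
unused = live_servers - {client.pick_server(k) for k in keys}
```

In this sketch, every key in `stale` is still sent to the failed server, and the newly added server appears in `unused` because the client does not know it exists. Both symptoms come from the client's list being fixed at startup, which is the gap a service discovery mechanism is meant to close.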
Maintain the cache servers list
We will address the service discovery problem first. The following ...