Caching Strategies
Explore how caching alleviates read-heavy hotspots by intercepting repetitive requests, and understand client-side, server-side, and distributed cache tiers. Learn cache invalidation strategies like write-through, write-behind, and cache-aside, and learn to manage cache size with eviction policies. Discover operational hazards such as cold starts and thundering herds, and how to monitor cache effectiveness. This lesson equips you to design caching layers that balance latency, consistency, and complexity.
Consider a partitioned and replicated PostgreSQL cluster handling 50,000 reads per second. The sharding strategy is correct, replicas are healthy, and failover is tested. Yet the system is buckling. Why? Because 80% of those requests target the same 5% of data (product listings, user profiles, session tokens), creating read-heavy hotspots that overwhelm replicas regardless of how well the data layer is distributed. Replication spreads data across nodes, but it does not eliminate redundant work when thousands of concurrent requests ask for the identical row.
This is where caching enters the architecture. Caching intercepts repetitive reads before they reach the database by storing precomputed or recently accessed data in a faster storage layer positioned closer to the consumer. It acts as an absorption layer that shields the database from read amplification.
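To make the absorption effect concrete, here is a back-of-the-envelope sketch. The 50,000 reads per second and the 80% hot share come from the scenario above. The function name and the assumption that the cache serves every hot-set read (and nothing else) are illustrative simplifications, not measurements.

```python
def database_reads_per_second(total_rps: float, hit_ratio: float) -> float:
    """Reads that still reach the database when the cache absorbs `hit_ratio` of traffic."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return total_rps * (1.0 - hit_ratio)


TOTAL_RPS = 50_000  # from the scenario above
HOT_SHARE = 0.80    # share of requests that target the hot 5% of data

# Assumption: every hot-set read is a cache hit and every other read is a miss.
remaining = database_reads_per_second(TOTAL_RPS, HOT_SHARE)
print(f"Reads reaching the database: about {round(remaining):,} per second")
```

Under that idealized assumption, the replicas see roughly one fifth of the original read traffic. Real hit ratios depend on expiry, eviction, and how often the hot data changes.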
This lesson covers cache types, invalidation strategies, the tension between cache size and data freshness, and the production tooling that makes caching operational. Caching also introduces its own consistency challenges that must be deliberately managed alongside whatever replication consistency model the database provides.
Types of caching
Caching operates at multiple tiers in a system’s request life cycle, and each tier addresses a different latency boundary.
Client-side caching: The browser or mobile client stores responses locally using HTTP cache headers such as Cache-Control and ETag. CDN edge caches like CloudFront and Fastly also fall into this tier. Client-side caching eliminates or shortens the network round trip: the request either never leaves the user’s device or stops at a nearby edge node.
Server-side caching: The application process maintains an in-memory data structure, such as a local hash map or an in-process LRU cache, that avoids database queries for hot data. This tier eliminates database round trips but is limited to a single node’s memory, so each application instance maintains its own independent copy.
Distributed caching: A shared cache cluster, such as Redis or Memcached, sits between the application tier and the database. Every application instance queries the same cache, eliminating per-node duplication while ...
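The server-side and distributed tiers described above can be sketched as a single read path. This is a minimal illustration under stated assumptions, not a production design: `LRUCache`, `tiered_get`, and `load_from_db` are hypothetical names, and a plain dictionary stands in for a shared cluster such as Redis or Memcached so the example runs without external services.

```python
from collections import OrderedDict
from typing import Callable, MutableMapping, Optional, Tuple


class LRUCache:
    """Tiny in-process LRU cache, standing in for the server-side tier."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: str) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict the least recently used entry


def tiered_get(
    key: str,
    local: LRUCache,
    shared: MutableMapping[str, str],  # stand-in for a shared cache cluster
    load_from_db: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (value, tier) where tier names the layer that served the read."""
    value = local.get(key)
    if value is not None:
        return value, "local"
    value = shared.get(key)
    if value is not None:
        local.put(key, value)  # warm this instance's private copy
        return value, "shared"
    value = load_from_db(key)  # slowest path: the database itself
    shared[key] = value
    local.put(key, value)
    return value, "database"


# Two application instances, each with its own local cache, sharing one cache.
db_calls = []

def load_from_db(key: str) -> str:
    db_calls.append(key)  # record how often the database is actually hit
    return f"row-for-{key}"

shared_cache = {}
instance_a, instance_b = LRUCache(capacity=2), LRUCache(capacity=2)

first = tiered_get("product:42", instance_a, shared_cache, load_from_db)
second = tiered_get("product:42", instance_a, shared_cache, load_from_db)
third = tiered_get("product:42", instance_b, shared_cache, load_from_db)
print(first[1], second[1], third[1], len(db_calls))
```

In this sketch, the first read falls through to the database, the repeat read on the same instance is served locally, and the other instance is served from the shared tier, so the database is queried only once. The sketch deliberately omits expiry and invalidation.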