Caching Patterns
Caching is a read-latency and database-protection tool first. This page focuses on patterns; Read scaling covers where caching sits relative to replicas and CDN, including overlap and sequencing.
Cache Reads Before Writes
Operational default: expose stable read paths through the cache rather than pretending writes “update the cache magically” everywhere. Write-heavy workloads still benefit from selective caching, but blindly caching mutable aggregates without invalidation invites stale reads and coherence bugs.
Cache-Aside
In cache-aside, application code reads the cache first, loads from the backing store on a miss, then populates the cache. Pros: straightforward and tolerant of transient cache outages (fall back to the DB). Cons: a burst of concurrent misses when a hot key expires, unless mitigated (see Cache Stampede).
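The read path described above can be sketched as follows. This is a minimal in-process illustration, not a production client: `CacheAside`, `load_from_store`, and the 60-second TTL are hypothetical names and values chosen for the example, and a dict stands in for the real cache.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class CacheAside:
    """Cache-aside reader: check the cache, load on a miss, then populate."""

    def __init__(self, load_from_store: Callable[[str], Optional[str]],
                 ttl_seconds: float = 60.0) -> None:
        self._load = load_from_store  # hypothetical backing-store loader
        self._ttl = ttl_seconds       # assumed TTL, tune per workload
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry[1]                      # hit: the store is not touched
        value = self._load(key)                  # miss or expired: read the store
        if value is not None:
            self._entries[key] = (time.monotonic() + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key, for example after the application writes to the store."""
        self._entries.pop(key, None)


# Usage: the second read is served from the cache, so the store sees one call.
store = {"user:1": "Ada"}
calls: List[str] = []

def load(key: str) -> Optional[str]:
    calls.append(key)
    return store.get(key)

cache = CacheAside(load)
first = cache.get("user:1")
second = cache.get("user:1")
```

With a networked cache, the same shape usually wraps the cache calls in error handling so that a cache failure degrades to a direct store read.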
Write-Through
Write-through updates the cache and the backing store together (ordering varies by implementation). It improves read consistency immediately after writes at the cost of write latency and write amplification. Use it when stale reads after writes are unacceptable and the working set fits cache economics.
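A minimal sketch of the write path, assuming one particular ordering (store first, then cache) so that a failed store write leaves the cache untouched. `WriteThroughCache` is a hypothetical name, and plain dicts stand in for both the cache and the backing store.

```python
from typing import Dict, Optional


class WriteThroughCache:
    """Write-through: every write goes to the store and the cache in one call."""

    def __init__(self, store: Dict[str, str]) -> None:
        self._store = store               # stand-in for the backing store
        self._cache: Dict[str, str] = {}  # stand-in for the cache

    def put(self, key: str, value: str) -> None:
        # Assumed ordering: store first. If this line raises, the cache
        # still holds the previous value rather than an unpersisted one.
        self._store[key] = value
        self._cache[key] = value

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:
            return self._cache[key]
        value = self._store.get(key)      # cold key: fall back to the store
        if value is not None:
            self._cache[key] = value
        return value


# Usage: a read immediately after a write sees the new value.
db: Dict[str, str] = {}
profile_cache = WriteThroughCache(db)
profile_cache.put("user:1", "Ada")
profile_cache.put("user:1", "Ada Lovelace")
latest = profile_cache.get("user:1")
```

Every `put` here pays for two writes, which is the write amplification the paragraph above mentions.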
Eviction Policies
LRU (least recently used) is a sensible general-purpose default. Some workloads profit from TTL-only expiry with size caps, or LFU (least frequently used) for stable hot sets. Tune based on measured miss rates and eviction churn.
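To make the LRU policy concrete, here is a small sketch built on the standard library's `collections.OrderedDict`. The `LRUCache` class and its `hits`, `misses`, and `evictions` counters are illustrative names; the counters are there only to show the kind of signals worth measuring before tuning.

```python
from collections import OrderedDict
from typing import Any, Hashable


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()
        self.hits = 0
        self.misses = 0
        self.evictions = 0

    def get(self, key: Hashable, default: Any = None) -> Any:
        if key not in self._items:
            self.misses += 1
            return default
        self._items.move_to_end(key)         # mark as most recently used
        self.hits += 1
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used key
            self.evictions += 1


# Usage: reading "a" makes "b" the oldest entry, so inserting "c" evicts "b".
lru = LRUCache(capacity=2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")
lru.put("c", 3)
```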
Cache Stampede
On expiry or bust, many concurrent misses slam the origin. Mitigations: probabilistic early refresh, a mutex per key (single-flight), pre-warming critical keys, TTL jitter so keys do not expire together, or layering (local L1 plus shared L2).
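Two of these mitigations, a per-key mutex and TTL jitter, can be sketched together. This is an in-process illustration using threads; `SingleFlightCache`, `jitter_fraction`, and the loader are hypothetical, and a shared cache would need a distributed lock or equivalent instead of `threading.Lock`.

```python
import random
import threading
import time
from typing import Callable, Dict, List, Optional, Tuple


class SingleFlightCache:
    """Per-key mutex so only one caller reloads an expired key, plus TTL jitter."""

    def __init__(self, loader: Callable[[str], str], ttl_seconds: float = 60.0,
                 jitter_fraction: float = 0.1) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._jitter = jitter_fraction        # assumed: +/-10% spread on each TTL
        self._entries: Dict[str, Tuple[float, str]] = {}
        self._key_locks: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()        # protects the lock table only

    def _fresh(self, key: str) -> Optional[Tuple[float, str]]:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry
        return None

    def get(self, key: str) -> str:
        entry = self._fresh(key)
        if entry is not None:
            return entry[1]
        with self._guard:
            lock = self._key_locks.setdefault(key, threading.Lock())
        with lock:
            entry = self._fresh(key)          # re-check: a peer may have loaded it
            if entry is not None:
                return entry[1]
            value = self._loader(key)         # only one thread per key gets here
            ttl = self._ttl * (1.0 + random.uniform(-self._jitter, self._jitter))
            self._entries[key] = (time.monotonic() + ttl, value)
            return value


# Usage: eight concurrent misses on one key produce a single origin call.
load_calls: List[str] = []

def slow_loader(key: str) -> str:
    load_calls.append(key)
    time.sleep(0.05)                          # simulate a slow origin
    return f"value-for-{key}"

hot_cache = SingleFlightCache(slow_loader)
results: List[str] = []
threads = [threading.Thread(target=lambda: results.append(hot_cache.get("hot")))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The sketch never removes entries from the lock table, which a long-lived process with many distinct keys would want to address.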
Thundering Herd
Distinct but related: many clients invalidate or cold-start together. Warm caches before the TTL cliff; stagger TTLs; consider graceful staleness (serve a slightly old value while rebuilding in the background).
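Graceful staleness can be sketched as a cache with two windows: a fresh window in which values are served directly, and a stale window in which the old value is still served while one background thread rebuilds it. The class name, the `fresh_for` and `stale_for` parameters, and the injectable `clock` are assumptions made for this example.

```python
import threading
import time
from typing import Callable, Dict, List, Tuple


class StaleWhileRevalidate:
    """Serve a slightly old value while one background thread rebuilds it."""

    def __init__(self, loader: Callable[[str], str], fresh_for: float,
                 stale_for: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._fresh_for = fresh_for   # seconds a value counts as fresh
        self._stale_for = stale_for   # extra seconds a stale value may be served
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (loaded_at, value)
        self._refreshing: Dict[str, threading.Thread] = {}
        self._lock = threading.Lock()

    def _reload(self, key: str) -> str:
        try:
            value = self._loader(key)
            with self._lock:
                self._entries[key] = (self._clock(), value)
            return value
        finally:
            with self._lock:
                self._refreshing.pop(key, None)

    def get(self, key: str) -> str:
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None:
                age = self._clock() - entry[0]
                if age < self._fresh_for:
                    return entry[1]                      # fresh
                if age < self._fresh_for + self._stale_for:
                    if key not in self._refreshing:      # at most one rebuild per key
                        worker = threading.Thread(target=self._reload,
                                                  args=(key,), daemon=True)
                        self._refreshing[key] = worker
                        worker.start()
                    return entry[1]                      # stale but acceptable
        return self._reload(key)                         # cold or too old: load inline

    def wait_for_refresh(self, key: str) -> None:
        """Block until any in-flight background rebuild of `key` finishes."""
        with self._lock:
            worker = self._refreshing.get(key)
        if worker is not None:
            worker.join()


# Usage with a fake clock so the example is deterministic.
now = [0.0]
loads: List[str] = []

def versioned_loader(key: str) -> str:
    loads.append(key)
    return f"v{len(loads)}"

swr = StaleWhileRevalidate(versioned_loader, fresh_for=10, stale_for=30,
                           clock=lambda: now[0])
cold_read = swr.get("k")      # loaded inline
now[0] = 15.0
stale_read = swr.get("k")     # old value returned, rebuild starts in background
swr.wait_for_refresh("k")
rebuilt_read = swr.get("k")   # rebuilt value
```

The cold path in this sketch still lets concurrent callers load inline, so it complements rather than replaces a per-key mutex.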
Horizontal Cache Scale
Redis Cluster (or similarly sharded caches) shards key space across nodes for throughput and memory headroom. Client routing must follow cluster topology moves; resilience planning matters when a node disappears.
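To illustrate how a sharded cache maps keys to nodes, the sketch below computes a Redis Cluster hash slot: CRC16 (XMODEM variant) of the key modulo 16384, hashing only the hash tag between the first `{` and the next `}` when that tag is non-empty. The three-node slot table and node names are hypothetical; real clients learn the mapping from the cluster and refresh it when slots move.

```python
import binascii
from typing import List, Tuple

SLOT_COUNT = 16384  # number of hash slots in Redis Cluster


def hash_slot(key: str) -> int:
    """Return the Redis Cluster hash slot for a key, honoring hash tags."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:   # non-empty tag: hash only the tag
            data = data[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is the CRC16 XMODEM variant.
    return binascii.crc_hqx(data, 0) % SLOT_COUNT


# Hypothetical topology: (first_slot, last_slot, node_name).
SLOT_TABLE: List[Tuple[int, int, str]] = [
    (0, 5460, "cache-node-a"),
    (5461, 10922, "cache-node-b"),
    (10923, 16383, "cache-node-c"),
]


def node_for(key: str) -> str:
    """Look up which node owns the key's slot in the static table above."""
    slot = hash_slot(key)
    for first, last, node in SLOT_TABLE:
        if first <= slot <= last:
            return node
    raise LookupError(f"no node owns slot {slot}")


# Usage: keys sharing a hash tag land in the same slot, so they share a node.
followers_slot = hash_slot("{user1000}.followers")
following_slot = hash_slot("{user1000}.following")
owner = node_for("{user1000}.followers")
```

A static table like this goes stale as soon as slots migrate, which is why client routing has to follow topology changes.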
Duplicate logic with read scaling pages is intentional: keep read-path narratives in sync when changing assumptions.
Related: Write scaling, Consistency and transactions, glossary (LRU, TTL, Redis).