Caching in Java Microservices: Redis, Spring Cache & Invalidation

Caching is the highest-leverage performance tool in a microservice platform — and the easiest to get subtly, dangerously wrong. A good cache turns a 200ms database call into a 2ms lookup; a bad one serves stale data, stampedes your database when a hot key expires, or quietly drifts out of sync across instances. This deep dive covers caching in Java microservices: the Spring Cache abstraction, Redis as a distributed cache, invalidation, and the failure modes that cause incidents.

TL;DR: Use Spring’s cache abstraction over Redis for data shared across instances. Always set a TTL: a cache without expiry is a memory leak and a staleness bug. Cache-aside is the default pattern. Plan invalidation up front (it’s the hard part). Protect hot keys from stampedes, design for Redis being unavailable, and never cache without measuring the hit rate.
flowchart TD
  Req[Request] --> Q{In cache?}
  Q -->|hit| Ret[Return cached value]
  Q -->|miss| DB[(Database)]
  DB --> Pop[Populate cache with TTL]
  Pop --> Ret
        
Cache-aside: serve from the cache on a hit; load from the source and populate on a miss.
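
As a rough illustration of the flow above, here is a minimal cache-aside sketch in plain Java. An in-memory map stands in for Redis, and the names `CacheAside`, `loader`, and `clock` are hypothetical, chosen for this example only.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Cache-aside sketch: an in-memory map stands in for Redis (assumption). */
final class CacheAside<K, V> {
    private record Entry<V>(V value, long expiresAtMillis) {}

    private final Map<K, Entry<V>> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // the slow source, e.g. a database query
    private final long ttlMillis;
    private final LongSupplier clock;      // injectable so expiry can be tested

    CacheAside(Function<K, V> loader, Duration ttl, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttl.toMillis();
        this.clock = clock;
    }

    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> cached = store.get(key);
        if (cached != null && now < cached.expiresAtMillis()) {
            return cached.value();                         // hit: skip the source
        }
        V loaded = loader.apply(key);                      // miss: load from the source
        if (loaded != null) {                              // do not cache null results
            store.put(key, new Entry<>(loaded, now + ttlMillis));
        }
        return loaded;
    }
}
```

In production the clock would typically be `System::currentTimeMillis`; passing it in keeps the expiry path easy to test.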

What to cache (and what not to)

Cache data that is read far more than written, expensive to produce, and tolerant of slight staleness: reference data, computed aggregates, the results of slow downstream calls, rendered fragments. Do not cache data that must be perfectly fresh (account balances at the moment of a transaction), is cheap to fetch anyway, or is unique per request (no reuse, so no hit). The cache only helps when the same value is read many times — measure the hit rate to confirm it’s earning its keep.
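
Measuring the hit rate can be as simple as counting hits and misses at the point where the cache is read. The sketch below is one hypothetical way to do that with the standard library; `HitRateCounter` is an illustrative name, and in a real service these numbers would more likely come from your metrics library or from Redis's own `keyspace_hits` and `keyspace_misses` counters.

```java
import java.util.concurrent.atomic.LongAdder;

/** Thread-safe hit/miss counter (illustrative; not part of Spring or Redis). */
final class HitRateCounter {
    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();

    void recordHit()  { hits.increment(); }
    void recordMiss() { misses.increment(); }

    /** Fraction of lookups served from the cache; 0.0 when nothing was recorded. */
    double hitRate() {
        long h = hits.sum();
        long total = h + misses.sum();
        return total == 0 ? 0.0 : (double) h / total;
    }
}
```

A hit rate that stays low after warm-up suggests the data is not reused enough to justify caching it.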

In-process vs distributed

Property | In-process (Caffeine) | Distributed (Redis)
Latency | Nanoseconds (local heap) | ~1ms (network hop)
Shared across instances | No, each instance has its own | Yes
Survives restart | No | Yes
Capacity | Bounded by heap | Large, independent

Use Caffeine for small, hot, read-mostly data where per-instance copies are fine. Use Redis when instances must share state, the cache should survive restarts, or entries are too big to hold everywhere. High-throughput systems often layer both — a near cache (Caffeine) in front of Redis — to cut even the 1ms network hop for the hottest keys.
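
The layered lookup might look roughly like the sketch below. Plain maps stand in for both Caffeine and Redis so the example is self-contained, and `TwoLevelCache` is a hypothetical name. TTLs and size bounds are deliberately omitted; a real near cache would need both, because writes made through other instances do not reach the local copy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Near cache in front of a shared cache; maps are stand-ins (assumption). */
final class TwoLevelCache<K, V> {
    private final Map<K, V> near = new ConcurrentHashMap<>(); // per-instance, like Caffeine
    private final Map<K, V> shared;                           // shared, like Redis
    private final Function<K, V> loader;                      // the slow source

    TwoLevelCache(Map<K, V> shared, Function<K, V> loader) {
        this.shared = shared;
        this.loader = loader;
    }

    V get(K key) {
        V local = near.get(key);
        if (local != null) {
            return local;                  // level 1 hit: no network hop
        }
        V remote = shared.get(key);
        if (remote == null) {
            remote = loader.apply(key);    // missed both levels: go to the source
            if (remote == null) {
                return null;               // nothing to cache
            }
            shared.put(key, remote);
        }
        near.put(key, remote);             // populate the near cache for next time
        return remote;
    }
}
```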

Spring’s cache abstraction over Redis

Spring decouples your code from the cache provider behind annotations. Add the Redis starter, enable caching, and annotate methods — switching providers later is configuration, not a rewrite.

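<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>

import java.time.Duration;
import org.springframework.cache.annotation.CacheEvict;
import org.springframework.cache.annotation.CachePut;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.cache.RedisCacheConfiguration;
import org.springframework.stereotype.Service;

@Configuration
@EnableCaching
class CacheConfig {
  @Bean
  RedisCacheConfiguration cacheConfig() {
    return RedisCacheConfiguration.defaultCacheConfig()
        .entryTtl(Duration.ofMinutes(10))   // ALWAYS set a TTL
        .disableCachingNullValues();        // cached methods must then never return null
  }
}

@Service
class ProductService {
  // Product and ProductRepository are assumed application types, not shown here.
  private final ProductRepository repository;

  ProductService(ProductRepository repository) { this.repository = repository; }

  @Cacheable(cacheNames = "product", key = "#id")
  public Product byId(String id) {
    return repository.findById(id).orElseThrow();   // slow lookup, runs only on a miss
  }

  @CachePut(cacheNames = "product", key = "#p.id")
  public Product update(Product p) {
    return repository.save(p);   // always runs; the result replaces the cached value
  }

  @CacheEvict(cacheNames = "product", key = "#id")
  public void delete(String id) {
    repository.deleteById(id);   // the cache entry is removed on delete
  }
}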

@Cacheable returns the cached value on a hit and runs the method only on a miss; @CachePut always runs and updates the cache; @CacheEvict removes entries. The single most important line is the TTL — see below.

Caching patterns

TTLs and the staleness trade-off

Every cached entry needs a time-to-live. A TTL is your safety net: even if invalidation logic misses a case, stale data self-heals when the entry expires. The trade-off is direct: short TTLs mean fresher data but more misses (and more load on the source); long TTLs mean better hit rates but more staleness. Tune per data type: seconds for fast-moving data, hours for stable reference data. A cache with no TTL is both a memory leak and a guaranteed staleness bug.
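
To put rough numbers on that trade-off, the sketch below estimates the hit rate for a single key under simplifying assumptions that are not from any measurement: reads arrive independently at a steady average rate, and the entry is refreshed only when a read finds it expired. Each expiry cycle then costs one miss followed by about rate × TTL hits. `TtlTradeoff` is a hypothetical helper name.

```java
/** Back-of-the-envelope TTL estimates for one key (assumes steady, independent reads). */
final class TtlTradeoff {
    private TtlTradeoff() {}

    /** Expected fraction of reads served from the cache. */
    static double expectedHitRate(double readsPerSecond, double ttlSeconds) {
        double readsPerTtl = readsPerSecond * ttlSeconds;   // hits that follow each miss
        return readsPerTtl / (1.0 + readsPerTtl);
    }

    /** Expected loads per second that still reach the source for this key. */
    static double sourceLoadsPerSecond(double readsPerSecond, double ttlSeconds) {
        return readsPerSecond / (1.0 + readsPerSecond * ttlSeconds);
    }
}
```

Under these assumptions, a key read 50 times per second has a hit rate of about 98.0% with a 1-second TTL (50/51) and about 99.97% with a 60-second TTL (3000/3001). The gain from lengthening the TTL flattens quickly for hot keys, while the staleness window keeps growing.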

Invalidation — the genuinely hard part

“There are only two hard things in computer science: cache invalidation and naming things.” In a distributed system the challenge is that data changes in one service while cached copies live in Redis (and maybe in each instance’s near cache). Strategies, roughly in order of strength:

Key design matters too: namespace keys clearly (product:{id}), and avoid the temptation to “clear the whole cache” on any change — that turns one write into a stampede.
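
A small sketch of key-scoped invalidation, assuming plain maps as stand-ins for the database and Redis. `ProductWriter` and `cacheKey` are illustrative names. The point is that a write touches exactly one namespaced key and leaves the rest of the cache warm.

```java
import java.util.Map;

/** Evict-on-write with namespaced keys; both maps are stand-ins (assumption). */
final class ProductWriter {
    private final Map<String, String> database;   // source of truth
    private final Map<String, String> cache;      // shared cache

    ProductWriter(Map<String, String> database, Map<String, String> cache) {
        this.database = database;
        this.cache = cache;
    }

    /** Builds the namespaced key, e.g. product:42. */
    static String cacheKey(String productId) {
        return "product:" + productId;
    }

    void rename(String productId, String newName) {
        database.put(productId, newName);      // 1. write to the source first
        cache.remove(cacheKey(productId));     // 2. evict only the affected key
    }
}
```

Evicting rather than overwriting means the next read repopulates the entry from the source, which avoids caching a value the writer computed from stale inputs.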

Cache stampede (thundering herd)

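A popular key expires; simultaneously hundreds of requests miss and all hit the database to recompute the same value. That stampede can knock over the very source you were protecting. Defenses: coalesce concurrent refills so only one thread recomputes the value, randomize TTLs slightly so keys do not expire in lockstep, and optionally serve stale data while refreshing in the background.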
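
Two common stampede defenses, request coalescing and TTL jitter, can be sketched with the standard library as below. `StampedeGuard` and `jittered` are hypothetical names. This version coalesces callers within one JVM only; coordinating across instances would need something like a distributed lock, which is not shown. For the single-instance case, Spring's `@Cacheable(sync = true)` offers comparable behavior where the cache provider supports it.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Function;

/** Coalesces concurrent loads of the same key within one JVM (illustrative sketch). */
final class StampedeGuard<K, V> {
    private final ConcurrentMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // the expensive recomputation

    StampedeGuard(Function<K, V> loader) {
        this.loader = loader;
    }

    V load(K key) {
        CompletableFuture<V> mine = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, mine);
        if (existing != null) {
            return existing.join();            // someone else is loading: wait for their result
        }
        try {
            V value = loader.apply(key);       // only one caller per key reaches the source
            mine.complete(value);
            return value;
        } catch (RuntimeException e) {
            mine.completeExceptionally(e);     // waiters see the failure as CompletionException
            throw e;
        } finally {
            inFlight.remove(key, mine);        // the next miss starts a fresh load
        }
    }

    /** Returns base plus or minus up to fraction of base, so keys do not expire together. */
    static Duration jittered(Duration base, double fraction) {
        long baseMillis = base.toMillis();
        long spread = (long) (baseMillis * fraction);
        long offset = spread == 0 ? 0 : ThreadLocalRandom.current().nextLong(-spread, spread + 1);
        return Duration.ofMillis(baseMillis + offset);
    }
}
```

For example, `StampedeGuard.jittered(Duration.ofMinutes(10), 0.1)` yields a TTL somewhere between 9 and 11 minutes.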

Design for the cache failing

Redis is a dependency, and dependencies fail. A cache should be an optimization, not a single point of failure — if Redis is down, the service should fall back to the source (slower but working), not error out. Configure short Redis timeouts so a slow cache fails fast rather than adding latency to every request, and wrap cache access so a Redis outage degrades performance instead of availability. Test this path explicitly.
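
One way to wrap cache access so an outage degrades to a slower path is sketched below. The `RemoteCache` interface and `FailOpenReader` class are hypothetical stand-ins for whatever client you use; the sketch assumes the client surfaces failures and timeouts as runtime exceptions. With Spring's annotations, the `CacheErrorHandler` hook serves a similar purpose.

```java
import java.util.Optional;
import java.util.function.Function;

/** Reads through a cache but never fails because of it (illustrative sketch). */
final class FailOpenReader<K, V> {
    /** Minimal cache client abstraction; an assumption, not a real Redis API. */
    interface RemoteCache<K, V> {
        Optional<V> get(K key);
        void put(K key, V value);
    }

    private final RemoteCache<K, V> cache;
    private final Function<K, V> source;   // the system of record

    FailOpenReader(RemoteCache<K, V> cache, Function<K, V> source) {
        this.cache = cache;
        this.source = source;
    }

    V read(K key) {
        try {
            Optional<V> cached = cache.get(key);
            if (cached.isPresent()) {
                return cached.get();
            }
        } catch (RuntimeException cacheUnavailable) {
            // Treat a failed or timed-out cache read as a miss; log and count it in practice.
        }
        V value = source.apply(key);       // slower, but the request still succeeds
        try {
            if (value != null) {
                cache.put(key, value);
            }
        } catch (RuntimeException cacheUnavailable) {
            // A failed write-back must not fail the request either.
        }
        return value;
    }
}
```

Pair this with a short client timeout; otherwise every request waits for the full timeout before falling back.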

Operational pitfalls

Takeaways

Caching done right is deliberate: cache read-heavy, expensive, staleness-tolerant data; reach for Redis when state must be shared; always set a TTL; plan invalidation before you ship; protect hot keys from stampedes; and make the system survive the cache being down. Spring’s cache abstraction makes the mechanics easy — the engineering is in the staleness, invalidation, and failure decisions, and in measuring the hit rate so you know the cache is actually paying for itself.

Frequently asked questions

When should I use a distributed cache like Redis instead of an in-memory cache?
Use an in-process cache (Caffeine) for small, hot, read-mostly data local to one instance. Use Redis when multiple service instances must share cached data, when the cache must survive restarts, or when entries are too large to hold per-instance. Many systems layer both (near cache + Redis).

What is a cache stampede and how do you prevent it?
A stampede happens when a popular key expires and many concurrent requests all miss and hit the database at once. Prevent it with request coalescing (a lock so only one thread refills), slightly randomized TTLs to avoid synchronized expiry, and optionally serving stale data while refreshing in the background.
