Apache Kafka Distributed Caching Architecture
Architecture
12 min read
By SyntheBrain Team

Building Distributed Caching with Apache Kafka: A Complete Guide

Yes, Kafka can be used to implement a distributed caching solution, but it requires careful design: Kafka is fundamentally a distributed log, not a key-value store. You can, however, architect a robust, scalable caching layer that uses Kafka as the backbone for cache synchronization, invalidation, and event-driven consistency across microservices.

Design Overview: Kafka-based Distributed Caching System

Goals

  • Distributed cache shared across hundreds of microservices
  • Handle cache writes, updates, evictions, TTL, and hit/miss logic
  • No dependency on Redis or commercial cache stores
  • Use Kafka as the source of truth and synchronization bus
  • High resilience, scalability, and low latency

High-Level Architecture

[Figure: a microservices cluster (Service Alpha, Beta, and Gamma with local Caffeine, Guava, or custom caches using LRU + TTL and ~10K entries each, plus ~97 more services) synchronized through Apache Kafka topics (cache-updates, cache-evictions, cache-requests; compacted, partitioned, replicated). A shared Cache Manager library handles event-driven synchronization, TTL management, and eviction coordination; a centralized Cache Loader resolves misses against the data sources (PostgreSQL, MongoDB, APIs). Characteristics: sub-millisecond local reads, eventual consistency across the cluster, 100k+ ops/sec per service, fault tolerance via Kafka replication, zero external cache dependencies.]

Figure 1: Modern Kafka-based distributed caching architecture with event-driven synchronization

Architecture Components

1. Local In-Memory Cache (per service instance)

Each microservice has a local in-memory cache (e.g., Caffeine, Guava, or custom implementation).

Key Features:

  • Fast access for reads (no network latency)
  • Supports TTL and eviction policies locally
  • Configurable size limits and cleanup strategies

2. Kafka Topics

Use Kafka topics to sync cache data across instances.

  • cache-updates: propagates new and updated records to all instances
  • cache-evictions: notifies instances of evictions or TTL-based removals
  • cache-requests: optional; broadcasts cache misses and responses
  • Compacted topics: use log compaction so the latest value per key is retained
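As a sketch, these topics could be created programmatically with Kafka's AdminClient. The topic names and the compaction setting mirror the list above; the partition and replication counts here are illustrative assumptions, not recommendations.

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

public class CacheTopicSetup {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // cache-updates: compacted so the latest value per key is retained
            NewTopic updates = new NewTopic("cache-updates", 12, (short) 3)
                .configs(Map.of(TopicConfig.CLEANUP_POLICY_CONFIG,
                                TopicConfig.CLEANUP_POLICY_COMPACT));

            // cache-evictions and cache-requests: regular retention is enough
            NewTopic evictions = new NewTopic("cache-evictions", 12, (short) 3);
            NewTopic requests = new NewTopic("cache-requests", 12, (short) 3);

            admin.createTopics(List.of(updates, evictions, requests)).all().get();
        }
    }
}
```

Compaction only keeps the latest record per key, which is exactly the semantics a cache-state topic needs for rebuilding local caches on startup.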

3. Cache Manager (library or sidecar)

Shared library used in each microservice (or external sidecar process).

Responsibilities:

  • Subscribes to the cache-updates and cache-evictions topics
  • Updates the local in-memory cache based on Kafka events
  • Publishes changes when a local write/update/evict occurs
  • Handles TTL and eviction timers

4. Cache Loader Service

Optional central service that loads data from DB or source-of-truth upon a cache miss.

Function:

Sends result via Kafka to all consumers, ensuring consistent data loading across the distributed cache.
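The loader's core logic can be sketched independently of the Kafka plumbing. The class and method names below are hypothetical; in production this would consume from cache-requests and publish the returned message to cache-updates.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical core of the Cache Loader Service, with Kafka plumbing omitted:
// given a missed key, fetch once from the source of truth and build the
// update event that would be broadcast to every consumer.
public class CacheLoader {

    private final Function<String, Object> sourceOfTruth; // e.g. a DB or API lookup

    public CacheLoader(Function<String, Object> sourceOfTruth) {
        this.sourceOfTruth = sourceOfTruth;
    }

    /** Handles one cache-miss request and builds the update event to broadcast. */
    public Map<String, Object> handleMiss(String key, long ttlMillis) {
        Object value = sourceOfTruth.apply(key); // single fetch for the whole cluster
        return Map.of(
            "key", key,
            "value", value,
            "ttl", ttlMillis,
            "timestamp", System.currentTimeMillis());
    }
}
```

Because only the loader touches the database, a hot key that misses in a hundred services still results in one upstream fetch.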

Flow Scenarios

Cache Operations Flow

[Figure: the three cache operation flows. Writes fan out from Service A via cache-updates to all other services; TTL expirations fan out via cache-evictions; misses are resolved through cache-requests, the Cache Loader, and the database, with the result published back.]

1. Cache Write/Update Flow

  1. Service A puts key:value in its local cache
  2. It publishes {key, value, ttl} to the cache-updates topic
  3. All other services consume the message and update their local caches
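The {key, value, ttl} payload can be modeled as a small immutable record. One possible shape, with field names as an assumption (the sourceId lets consumers skip events they published themselves):

```java
// Sketch of the event published to cache-updates on every write/update.
// Field names are illustrative; align them with your serializer config.
public record CacheUpdateMessage(String key, Object value, long ttl,
                                 long timestamp, String sourceId) {}
```

Keeping the message small and keyed by the cache key also plays well with compacted topics, where Kafka retains only the newest record per key.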

2. Cache Eviction Flow (manual or TTL)

  1. A service detects an expired or evicted item locally
  2. It publishes {key} to the cache-evictions topic
  3. All services remove the key from their local caches

3. Cache Miss Scenarios

Option A – Self-load

Miss triggers direct DB/API fetch → update local + publish to Kafka

✅ Simple implementation
✅ Fast response
⚠️ Multiple services may fetch same data

Option B – Broadcast load

1. Service A detects cache miss

2. Publishes to cache-requests

3. Cache Loader Service responds via cache-updates

✅ Centralized loading
✅ Avoids duplicate fetches
⚠️ Additional latency

TTL & Eviction Policies

Time-To-Live (TTL) Management

  • Use local timers per instance (based on message TTL metadata)
  • Kafka does not enforce TTL – you manage expiration in memory
  • Each service manages its own expiration scheduling
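Since Kafka itself never expires entries, each instance has to track deadlines from the TTL metadata in the message. A minimal sketch (all names hypothetical; libraries like Caffeine implement this far more efficiently) stores an absolute deadline per entry and expires lazily on read:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal per-instance TTL tracking: each entry stores an absolute deadline
// computed from the ttl metadata carried in the Kafka message. Expiration is
// checked lazily on read; a background sweep could additionally publish
// cache-eviction events for expired keys.
public class TtlCache<V> {

    private record Entry<V>(V value, long expiresAtMillis) {}

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();

    public void put(String key, V value, long ttlMillis, long nowMillis) {
        entries.put(key, new Entry<>(value, nowMillis + ttlMillis));
    }

    /** Returns the value, or null if the key is absent or past its deadline. */
    public V get(String key, long nowMillis) {
        Entry<V> e = entries.get(key);
        if (e == null) return null;
        if (nowMillis >= e.expiresAtMillis()) {
            entries.remove(key); // lazily expire; real code would publish the eviction
            return null;
        }
        return e.value();
    }
}
```

Passing the current time in explicitly keeps the expiration logic deterministic and testable; production code would call `System.currentTimeMillis()` at the call site.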

Eviction Strategies

  • LRU (Least Recently Used) - most common
  • Size-based eviction when cache reaches limits
  • Handled locally per instance using Caffeine or similar
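The LRU idea itself fits in a few lines of plain Java using LinkedHashMap's access-order mode; Caffeine provides this (plus TTL and much better concurrency) out of the box, so this sketch only illustrates the mechanism:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Size-bounded LRU cache: with accessOrder = true, LinkedHashMap reorders
// entries on every get(), so the eldest entry is always the least recently
// used one and gets dropped once the cache exceeds maxEntries.
public class LruCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    public LruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true enables LRU ordering
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // real code would also publish a cache-eviction event
    }
}
```

Note that this class is not thread-safe; in a real service you would wrap it or, better, use a purpose-built cache library.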

💡 Pro Tip

Use libraries like Caffeine (Java) or node-cache (Node.js) that provide built-in TTL and eviction policies. This handles the complexity of memory management while you focus on Kafka synchronization.

Implementation Example

Java Spring Boot Cache Manager

import java.time.Duration;
import java.util.UUID;
import java.util.concurrent.TimeUnit;

import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;
import com.github.benmanes.caffeine.cache.RemovalCause;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Component;

@Component
public class KafkaDistributedCache {

    private final KafkaTemplate<String, Object> kafkaTemplate;
    private final LoadingCache<String, Object> localCache;

    // Identifies this instance so it can ignore its own published events
    private final String instanceId = UUID.randomUUID().toString();

    public KafkaDistributedCache(KafkaTemplate<String, Object> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
        this.localCache = Caffeine.newBuilder()
            .maximumSize(10_000)
            .expireAfterWrite(30, TimeUnit.MINUTES)
            .removalListener((String key, Object value, RemovalCause cause) -> {
                if (cause == RemovalCause.EXPIRED) {
                    publishEviction(key);   // tell other instances the entry expired
                }
            })
            .build(this::loadFromDatabase); // self-load on miss (Option A)
    }

    public Object get(String key) {
        return localCache.get(key);         // triggers loadFromDatabase on a miss
    }

    public void put(String key, Object value, Duration ttl) {
        localCache.put(key, value);

        CacheUpdateMessage message = CacheUpdateMessage.builder()
            .key(key)
            .value(value)
            .ttl(ttl.toMillis())
            .timestamp(System.currentTimeMillis())
            .sourceId(instanceId)           // lets consumers detect their own events
            .build();

        kafkaTemplate.send("cache-updates", key, message);
    }

    // Assumes a JSON (de)serializer is configured for the message DTOs
    @KafkaListener(topics = "cache-updates")
    public void handleCacheUpdate(CacheUpdateMessage message) {
        if (!isFromSelf(message)) {
            localCache.put(message.getKey(), message.getValue());
        }
    }

    @KafkaListener(topics = "cache-evictions")
    public void handleCacheEviction(CacheEvictionMessage message) {
        localCache.invalidate(message.getKey());
    }

    private boolean isFromSelf(CacheUpdateMessage message) {
        return instanceId.equals(message.getSourceId());
    }

    private void publishEviction(String key) {
        kafkaTemplate.send("cache-evictions", key, new CacheEvictionMessage(key));
    }

    private Object loadFromDatabase(String key) {
        // Replace with the real source-of-truth lookup (DB, API, ...)
        return null;
    }
}

This example shows a basic implementation using Spring Boot, Kafka, and Caffeine cache with automatic TTL management and Kafka synchronization.

Trade-offs & Considerations

  • No built-in TTL in Kafka: use local timers per service
  • Brief cache inconsistency: accept eventual consistency within TTL bounds
  • Large payloads: compress cache entries (e.g., Snappy, GZIP)
  • Cold-start latency: prewarm on startup or request from the loader
  • Overhead for small keys: group cache entries by domain to reduce Kafka volume
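For the large-payload concern, values can be compressed before they are published. GZIP ships with the JDK, so it is shown here; Snappy needs an extra library. (Kafka producers can also compress whole batches via the compression.type setting, which is often the simpler choice.) The class name is illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// GZIP-compresses serialized cache values before they are published to
// Kafka, and restores them on the consumer side.
public final class CacheCompression {

    public static byte[] compress(byte[] raw) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static byte[] decompress(byte[] compressed) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gz.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Compression pays off for large, repetitive values (serialized JSON, HTML fragments); for tiny entries the GZIP header overhead can make the payload larger, so measure before enabling it everywhere.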

Recommended Technology Stack

Core Technologies

  • Apache Kafka (with compacted topics)
  • Caffeine or Guava for in-memory caching
  • Micrometer + Prometheus for metrics

Optional Enhancements

  • Kafka Streams for aggregate cache sync logic
  • Spring Boot with embedded Kafka client
  • WebSocket/gRPC for bulk cache refresh

Ready to implement Kafka-based distributed caching?

This architecture provides a scalable, resilient caching solution that can handle hundreds of microservices without the complexity and cost of traditional cache stores.

Thank you for reading this comprehensive guide to Kafka-based distributed caching.