
Memory Engineering: Encoding, Eviction, TTL, Fragmentation, and Hot Keys

Learn Java Redis In Action - Part 026

Production Redis memory engineering for Java engineers covering memory model, key/value overhead, encodings, TTL behavior, eviction policies, fragmentation, hot keys, large keys, capacity planning, Java serialization impact, and operational runbooks.

[2026-07-02] · 24 min read · 4627 words


Part 026 — Memory Engineering: Encoding, Eviction, TTL, Fragmentation, and Hot Keys

Part 025 covered transactions, Lua scripts, Redis Functions, and atomic workflows. Now we shift to the resource that defines Redis more than anything else:

Memory.

Redis is fast because your working data is primarily in memory. That means memory is not merely a storage concern. It is also:

a latency concern
a reliability concern
a cost concern
a scaling concern
a data modeling concern
an eviction correctness concern
a Java serialization concern

The core mental model:

Redis memory engineering is the discipline of designing key count, value size, data structure encoding, TTL distribution, eviction policy, replication overhead, and hot-key behavior so memory pressure remains predictable under normal load, spikes, failover, and data growth.

1. Kaufman Skill Decomposition

The skill is not “set maxmemory”. The real skill is:

Given a Redis workload, estimate memory growth, choose structures and TTLs intentionally, prevent hot/large key pathologies, configure eviction safely, and operate with enough headroom for persistence, replication, failover, and fragmentation.

Breakdown:

Sub-skill	What you must be able to do
Memory mental model	Understand used memory, RSS, overhead, allocator fragmentation, and dataset size
Data structure sizing	Estimate key/value overhead and cardinality-driven memory growth
Encoding awareness	Know how small hashes/lists/sets/zsets can use compact encodings and when they expand
TTL engineering	Design expiration semantics, jitter, cleanup, and lifecycle ownership
Eviction policy	Select `noeviction`, LRU/LFU/random/TTL policies based on correctness
Capacity planning	Budget dataset, overhead, replicas, buffers, persistence, and growth headroom
Hot key mitigation	Detect and reduce concentrated QPS on individual keys or hash slots
Large key mitigation	Avoid huge values/memberships that cause latency, memory, and migration issues
Java payload discipline	Measure serialized size, compression trade-offs, and schema overhead
Operational response	Diagnose memory spikes, fragmentation, eviction storms, and OOM risk

Kaufman-style outcome:

After this part, you should be able to review a Redis schema and predict whether it will fail by memory growth, hot keys, eviction semantics, fragmentation, or serialization bloat.

2. Redis Memory Is Not Just Your Values

A common beginner estimate:

memory = sum(serialized value sizes)

This is wrong.

A better model:

Redis memory ≈
  key bytes
+ value bytes
+ object metadata
+ data structure overhead
+ allocator overhead
+ expiration metadata
+ client buffers
+ replication backlog/buffers
+ AOF/RDB rewrite overhead
+ fragmentation
+ temporary command/result memory
+ module/index memory

For Redis Search/JSON/Time Series/probabilistic/vector features, add:

+ index structures
+ token dictionaries
+ posting lists
+ vector index memory
+ labels/metadata
+ compaction/downsampled series

The practical consequence:

Small keys and small values are not free. High key count can be expensive even when each value is tiny.

3. Memory Observability Vocabulary

You cannot engineer what you cannot name.

Important signals from INFO MEMORY, INFO STATS, and related tooling:

Signal	Meaning	Why it matters
`used_memory`	Memory allocated by Redis allocator	Main application-level memory signal
`used_memory_human`	Human-readable version	Quick inspection
`used_memory_rss`	Memory seen by OS resident set	Includes fragmentation/allocator behavior
`mem_fragmentation_ratio`	`used_memory_rss` divided by `used_memory`	High ratio may indicate fragmentation
`maxmemory`	Configured memory ceiling for data	Controls eviction/OOM behavior
`maxmemory_policy`	Eviction behavior	Defines correctness under pressure
`evicted_keys`	Count of evicted keys	Indicates pressure and possible data loss for cache keys
`expired_keys`	Count of expired keys	Indicates TTL lifecycle activity
`keyspace_hits/misses`	Cache efficiency	Helps reason about memory usefulness
`mem_clients_normal`	Client buffer memory	Spikes can indicate slow clients or huge responses
`mem_replication_backlog`	Replication backlog memory	Needed for replica partial resync
`allocator_frag_ratio`	Allocator fragmentation detail	More precise fragmentation signal

Do not use one number alone. Use a narrative:

Dataset grew 18% week over week.
used_memory is at 76% of maxmemory.
RSS is 1.35x allocator-used memory.
Evictions started after traffic spike.
Top key pattern is session hash with 20M keys.
Average serialized payload increased after release 2026.07.02.

That is engineering.
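To build that narrative from raw numbers, it helps to turn the INFO MEMORY reply into ratios. The following is a minimal sketch, assuming the reply is available as the usual `field:value` text with `#` section headers; the class name `InfoMemorySnapshot` and its methods are hypothetical helpers, not part of any Redis client.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: parses an INFO MEMORY text reply and derives two ratios. */
final class InfoMemorySnapshot {
    private final Map<String, String> fields = new HashMap<>();

    InfoMemorySnapshot(String infoReply) {
        for (String line : infoReply.split("\r?\n")) {
            // Skip blank lines and section headers such as "# Memory".
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int colon = line.indexOf(':');
            if (colon > 0) {
                fields.put(line.substring(0, colon), line.substring(colon + 1).trim());
            }
        }
    }

    /** Returns the numeric field, or 0 when the field is absent. */
    long longField(String name) {
        return Long.parseLong(fields.getOrDefault(name, "0"));
    }

    /** used_memory / maxmemory, or -1 when maxmemory is 0 (no limit configured). */
    double usedToMaxRatio() {
        long max = longField("maxmemory");
        return max == 0 ? -1.0 : (double) longField("used_memory") / max;
    }

    /** used_memory_rss / used_memory, or -1 when used_memory is 0. */
    double rssToUsedRatio() {
        long used = longField("used_memory");
        return used == 0 ? -1.0 : (double) longField("used_memory_rss") / used;
    }
}
```

Feeding these two ratios into dashboards alongside eviction counts and payload histograms gives the multi-signal view described above, rather than a single number.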

4. Key Count vs Value Size

Two Redis systems can use the same total value bytes but behave very differently.

System A: many tiny keys

10,000,000 keys × 40-byte values

Risks:

key metadata overhead dominates
expiration metadata if TTL exists
dictionary resizing cost
keyspace scanning is expensive
cluster slot distribution matters
backup/restore contains huge key count

System B: fewer larger structures

100,000 hashes × 100 fields

Risks:

large hash access can become heavy
partial TTL may be harder unless using field expiration features
hot parent key risk
eviction removes entire parent key
rebalancing/migration can be heavier

Decision principle

Requirement	Prefer
independent TTL per item	separate keys or field-expiration-aware design
atomic group update	one hash or same-slot keys
frequent partial field update	hash
huge per-user collection	shard into buckets
independent eviction	separate keys
low key overhead	grouped structure
avoid hot parent	separate/sharded keys

There is no universally correct shape. The shape follows lifecycle, access pattern, and failure model.

5. Key Naming Memory Discipline

Readable keys are good. Excessively verbose keys are expensive at scale.

Bad at high cardinality:

production:identity-service:tenant:tenant-123456789:user:user-987654321:session:session-abcdef:metadata:v1

Better:

sess:v1:{tenant-123456789}:user-987654321:session-abcdef

Even better if the tenant/user context is in the value and the key only needs uniqueness:

sess:v1:{tenant-123456789}:session-abcdef

Trade-off:

Key property	Benefit	Cost
Human-readable	easier debugging	more bytes
Short prefix	memory efficient	less self-documenting
Hash tag	Cluster atomicity/routing	hot slot risk if low-cardinality
Version marker	migration control	extra bytes
Multi-dimension key	direct lookup	cardinality explosion

Rule:

A key format should be short enough for scale and structured enough for operations.
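A quick way to make the cost of key names concrete is to count bytes and multiply by cardinality. This is a rough sketch that only counts the key name bytes; Redis adds per-key overhead on top, which this does not model. `KeyNameCost` is a hypothetical helper name.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: compares raw key-name bytes across a keyspace. */
final class KeyNameCost {
    /** Raw bytes of the key name only (no Redis object or dictionary overhead). */
    static int keyBytes(String key) {
        return key.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Name bytes saved across keyCount keys by switching from verboseKey to compactKey. */
    static long savedBytes(String verboseKey, String compactKey, long keyCount) {
        return (long) (keyBytes(verboseKey) - keyBytes(compactKey)) * keyCount;
    }
}
```

Calling `savedBytes` with the verbose and compact session keys above and a key count of 10,000,000 gives a lower bound on the saving, since it ignores allocator rounding and any copies of the key held elsewhere.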

6. Value Size Discipline in Java

Java engineers often accidentally bloat Redis values through serialization.

Common causes:

storing full DTOs instead of read-model slices
including null/default fields
serializing class metadata
using Java native serialization
storing nested object graphs
storing debug fields in hot keys
storing duplicated text labels in millions of entries
gzip-compressing tiny values and increasing size
JSON field names repeated in every object

Example:

{
  "customerIdentifier": "cust-123",
  "customerLifecycleStatus": "ACTIVE",
  "customerSegmentationTier": "GOLD",
  "customerRiskAssessmentSummary": {
    "currentRiskLevel": "LOW",
    "lastRiskEvaluationTimestamp": "2026-07-02T14:00:00Z"
  }
}

For a hot read model, maybe this is enough:

{"id":"cust-123","st":"A","tier":"G","risk":"L","rev":42}

Do not blindly minify everything. But for millions of keys, payload envelope matters.

Java payload measurement test

@Test
void serializedCustomerCachePayloadMustStaySmall() throws Exception {
    CustomerCacheEntry entry = sampleEntry();

    byte[] bytes = objectMapper.writeValueAsBytes(entry);

    assertThat(bytes.length)
        .as("customer cache payload size")
        .isLessThanOrEqualTo(512);
}

This is a production-grade practice:

Treat serialized Redis payload size as an API contract.

7. Encoding Awareness

Redis internally uses optimized encodings for small data structures. You do not usually control the exact encoding directly in application code, but you must understand the pattern:

Small compact structures are efficient until thresholds are crossed; after that Redis may use a less compact representation optimized for larger operations.

Examples of encoding-aware thinking:

Structure	Compact when	Risk when it grows
Hash	few/small fields	many fields or large values increase overhead
List	compact sequential values	huge lists create latency and migration issues
Set	small integer-like members may be compact	arbitrary strings/many members require hash table representation
Sorted Set	small compact representation possible	large zsets maintain score ordering with more overhead
JSON	expressive document	nested documents and indexes can multiply memory
Search index	fast query	index memory can exceed source document memory
Vector index	semantic retrieval	embeddings and index structures are memory-heavy

Do not memorize internal thresholds as architecture. They can change by Redis version/config. Instead, measure with production-like data using:

MEMORY USAGE key
OBJECT ENCODING key
INFO MEMORY

Practical workflow

Generate 100k realistic keys.
Load into Redis.
Measure total memory.
Measure representative MEMORY USAGE.
Increase field count/value size.
Observe memory slope.
Repeat with alternative modeling.

Architecture should be based on measured slope, not intuition.

8. TTL Engineering

TTL is not just cleanup. TTL is part of correctness.

Questions:

Question	Why it matters
Who owns the lifecycle?	DB row, session, event, cache writer, worker?
Is expiry correctness-sensitive?	lock/idempotency/session semantics depend on it
Is TTL absolute or sliding?	impacts refresh traffic and memory growth
Should TTL have jitter?	avoids synchronized expiration storm
What happens if TTL is missing?	memory leak or durable state?
What happens if key expires early?	duplicate processing or stale denial?
What happens if key never expires?	unbounded memory growth

TTL types

Type	Example	Risk
Fixed TTL	cache product for 10 minutes	stampede at boundary
Sliding TTL	extend session on access	write amplification
Logical TTL	store expiry timestamp in value	requires app enforcement
Jittered TTL	10 min ± random 60 sec	harder exact debugging
No TTL	durable Redis-owned state	must have explicit cleanup/capacity plan
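The write amplification of sliding TTL can be reduced by refreshing only when the remaining TTL falls below a threshold. A minimal sketch of that decision, assuming the caller has already read the remaining TTL; `SlidingTtlPolicy` and `refreshBelowFraction` are hypothetical names.

```java
import java.time.Duration;

/** Hypothetical policy: refresh a sliding TTL only when it has decayed enough. */
final class SlidingTtlPolicy {
    private final Duration fullTtl;
    private final double refreshBelowFraction;

    SlidingTtlPolicy(Duration fullTtl, double refreshBelowFraction) {
        if (refreshBelowFraction <= 0 || refreshBelowFraction > 1) {
            throw new IllegalArgumentException("refreshBelowFraction must be in (0, 1]");
        }
        this.fullTtl = fullTtl;
        this.refreshBelowFraction = refreshBelowFraction;
    }

    /** True when the remaining TTL is below the configured fraction of the full TTL. */
    boolean shouldRefresh(Duration remainingTtl) {
        long thresholdMillis = (long) (fullTtl.toMillis() * refreshBelowFraction);
        return remainingTtl.toMillis() < thresholdMillis;
    }
}
```

With a 30-minute session TTL and a fraction of 0.5, a session touched every few seconds triggers at most one expiry write per 15 minutes instead of one per request.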

TTL jitter

Bad:

redis.setex(key, 3600, value);

for millions of keys written by the same batch.

Better:

int baseSeconds = 3600;
int jitter = ThreadLocalRandom.current().nextInt(-300, 301);
redis.setex(key, baseSeconds + jitter, value);

Jitter prevents synchronized expiration causing load spikes.

TTL invariant test

@Test
void cacheWriterMustAttachTtl() {
    productCache.put(productId, payload);

    long ttl = redis.ttl("product:v1:" + productId);

    assertThat(ttl).isGreaterThan(0);
}

For cache keys, a missing TTL is usually a bug. For durable Redis-owned indexes, attaching a TTL might itself be the bug. Make the intended lifecycle explicit for each key group.

9. Expiration Is Not a Precise Deletion Schedule

Redis expiration should not be modeled as an exact scheduler.

Key expiry means:

Redis will make the key unavailable after TTL semantics are met, but operational deletion timing and memory reclamation are implementation concerns.

Implications:

expired keys can be removed passively when accessed
active expiration also samples and cleans keys
memory pressure and workload shape affect observed cleanup behavior
keyspace notifications for expiry are signals, not durable events
expiration should not be the only trigger for critical workflows

Bad design:

When payment hold key expires, treat that as the official cancellation event.

Better:

Payment hold has expiresAt in durable database.
Scheduler scans due holds.
Redis TTL is acceleration/cache cleanup only.

TTL is great for lifecycle cleanup. It is not a replacement for durable scheduling when correctness matters.
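The durable alternative can be sketched as a scan over records that carry their own `expiresAt`. This is an in-memory illustration only: `PaymentHold` stands in for a database row, and `PaymentHoldScanner` is a hypothetical name for whatever scheduler runs the scan.

```java
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical scheduler step: finds holds that are due according to durable state. */
final class PaymentHoldScanner {
    /** Stand-in for a durable database row. */
    static final class PaymentHold {
        final String holdId;
        final Instant expiresAt;
        final boolean cancelled;

        PaymentHold(String holdId, Instant expiresAt, boolean cancelled) {
            this.holdId = holdId;
            this.expiresAt = expiresAt;
            this.cancelled = cancelled;
        }
    }

    /** Returns holds that are not yet cancelled and whose expiresAt is at or before now. */
    static List<PaymentHold> findDueHolds(List<PaymentHold> holds, Instant now) {
        return holds.stream()
            .filter(hold -> !hold.cancelled)
            .filter(hold -> !hold.expiresAt.isAfter(now))
            .collect(Collectors.toList());
    }
}
```

Because the decision depends only on durable state and the clock, it gives the same answer whether or not the Redis key has already been reclaimed.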

10. Eviction Policy Mental Model

maxmemory defines memory ceiling. maxmemory-policy defines what Redis does when memory is full and a write needs more memory.

Eviction is not cleanup. Eviction is pressure response.

Common policies:

Policy	Candidate keys	Selection
`noeviction`	none	writes fail when memory is full
`allkeys-lru`	all keys	approximated least recently used
`volatile-lru`	keys with TTL	approximated least recently used
`allkeys-lfu`	all keys	approximated least frequently used
`volatile-lfu`	keys with TTL	approximated least frequently used
`allkeys-random`	all keys	random
`volatile-random`	keys with TTL	random among expiring keys
`volatile-ttl`	keys with TTL	keys nearest expiration first

Correctness selection

Workload	Safer policy
Pure cache, all keys discardable	`allkeys-lru` or `allkeys-lfu`
Cache where only TTL keys are discardable	`volatile-lru` / `volatile-lfu`
Redis as durable-ish state store	`noeviction` plus alerting
Mixed durable state and cache in same Redis	Prefer separate Redis instances; if not, use volatile policy carefully
Lock/idempotency/session critical enough to not randomly disappear	avoid eviction-based correctness; reserve memory or separate deployment

Important principle:

Eviction is acceptable only for data whose disappearance is part of the design.

If a key disappearing breaks correctness, it must not be subject to eviction.
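The correctness table can be encoded so that the policy choice is reviewed in code rather than remembered. A small sketch, assuming three coarse workload categories; the enum and the `preferFrequency` flag are hypothetical, and the returned strings are the standard `maxmemory-policy` values.

```java
/** Hypothetical advisor: maps a workload category to a maxmemory-policy value. */
final class EvictionPolicyAdvisor {
    enum Workload { PURE_CACHE, TTL_KEYS_DISCARDABLE, DURABLE_STATE }

    /**
     * preferFrequency selects the LFU variant over LRU where eviction is allowed.
     * Durable state always maps to noeviction, which must be paired with alerting.
     */
    static String suggestPolicy(Workload workload, boolean preferFrequency) {
        switch (workload) {
            case PURE_CACHE:
                return preferFrequency ? "allkeys-lfu" : "allkeys-lru";
            case TTL_KEYS_DISCARDABLE:
                return preferFrequency ? "volatile-lfu" : "volatile-lru";
            case DURABLE_STATE:
                return "noeviction";
            default:
                throw new IllegalArgumentException("unknown workload: " + workload);
        }
    }
}
```

A deployment check could compare this suggestion with the value returned by `CONFIG GET maxmemory-policy` and fail when they differ.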

11. Cache Eviction vs Business State Loss

Consider these keys:

cache:v1:product:123
idem:v1:payment:req-abc
lock:v1:invoice:inv-9
session:v1:user:u-1
quota:v1:tenant:t-1:minute

If memory pressure evicts them:

Key	Impact of eviction
product cache	cache miss, usually OK
idempotency marker	duplicate payment risk
lock key	concurrent processing risk
session	user logout or security issue
quota key	limit bypass or reset

This is why mixing Redis roles is dangerous.

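Recommended separation: keep discardable cache data on one Redis deployment and correctness-sensitive state (idempotency markers, locks, sessions, quotas) on another, each with its own maxmemory and eviction policy.

When cost forces sharing, tag and monitor key groups separately and choose volatile policies with discipline. But for high-risk systems, separate the blast radius.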

12. Capacity Planning Model

A useful Redis capacity estimate:

required_memory =
  measured_dataset_memory
× growth_factor
× peak_factor
× replication_factor_overhead
× fragmentation_factor
+ operational_headroom

But do not blindly multiply guesses. Measure.

Capacity worksheet

Dimension	Example
current key count	25,000,000
average key bytes	42
average value bytes	380
measured average memory/key	680 bytes
current dataset memory	~17 GB
90-day growth	1.4x
peak batch load	1.2x
fragmentation/headroom	1.3x
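operational reserve	25%
target memory before reserve	~37 GB
target memory with reserve	~46 GB

Formula:

17 GB × 1.4 × 1.2 × 1.3 ≈ 37.1 GB
37.1 GB × 1.25 ≈ 46.4 GB once the 25% operational reserve is added on top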

If maxmemory is 32 GB, the system is already on a collision course.
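The worksheet arithmetic can be kept as a small calculator so the estimate is recomputed whenever a measured input changes. This sketch assumes decimal gigabytes and treats the operational reserve as a fraction added on top of the multiplied estimate, which is one possible reading of the worksheet; `CapacityEstimate` is a hypothetical helper.

```java
/** Hypothetical calculator for the capacity worksheet. */
final class CapacityEstimate {
    /** Dataset size in decimal GB from a measured average memory per key. */
    static double datasetGb(long keyCount, long measuredBytesPerKey) {
        return keyCount * (double) measuredBytesPerKey / 1_000_000_000.0;
    }

    /**
     * dataset × growth × peak × fragmentation, then the reserve added on top.
     * reserveFraction of 0.25 means 25% extra; pass 0 to see the pre-reserve value.
     */
    static double requiredGb(double datasetGb, double growthFactor, double peakFactor,
                             double fragmentationFactor, double reserveFraction) {
        double beforeReserve = datasetGb * growthFactor * peakFactor * fragmentationFactor;
        return beforeReserve * (1.0 + reserveFraction);
    }
}
```

With the worksheet inputs, `datasetGb(25_000_000, 680)` is 17 GB, and `requiredGb(17, 1.4, 1.2, 1.3, 0)` reproduces the 37.1 GB figure from the formula.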

Headroom categories

Headroom	Why needed
traffic spike	hot period creates more temporary data
release growth	new fields/keys after deployment
retry storm	duplicate request state grows
replication	backlog/buffers need space
persistence	rewrite/copy-on-write can increase memory pressure
fragmentation	allocator/OS memory behavior
failover	topology changes and client reconnection

Top 1% engineers plan for these before the incident.

13. Large Key Pathology

A large key is not only a memory problem. It is also a latency and operations problem.

Examples:

SET tenant:all-users -> 5 million members
HASH tenant:profile-cache -> 20 million fields
ZSET global:leaderboard -> 100 million members
STRING report:latest -> 80 MB JSON

Risks:

slow command execution
huge network responses
client memory spikes
blocking deletes unless lazy deletion is used
replication overhead
cluster migration pain
backup/restore cost
eviction removes too much at once
impossible per-member TTL unless explicitly modeled

Large key detection

Use operational tools and command patterns:

MEMORY USAGE key
HLEN key
SCARD key
ZCARD key
LLEN key
STRLEN key

For broad inspection, use sampling tools rather than full blocking scans on production hot paths.

Mitigation patterns

Problem	Mitigation
huge set per tenant	shard by bucket: `set:{tenant}:bucket:{n}`
huge sorted set leaderboard	partition by region/time/tier; keep top-N materialized
huge JSON blob	split into fields or read-model slices
huge hash	shard by field hash or lifecycle boundary
huge delete	use `UNLINK` instead of `DEL` where appropriate
huge read	paginate with `HSCAN`/`SSCAN`/`ZSCAN` or maintain smaller indexes

Bucketed set example

int bucket = Math.floorMod(userId.hashCode(), 128);
String key = "tenant-users:v1:{" + tenantId + "}:b" + bucket;
redis.sadd(key, userId);

Reading all users now requires querying every bucket, but hot writes and memory operations are spread across 128 smaller keys.
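The read side of the bucketed design needs the full list of bucket keys. A minimal sketch of the key helpers, assuming the same key format and 128 buckets as above; `TenantUserBuckets` is a hypothetical name, and no Redis client is involved.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical key helper for a set sharded into a fixed number of buckets. */
final class TenantUserBuckets {
    static final int BUCKETS = 128;

    /** Key of the bucket that holds userId. String.hashCode is stable across JVMs. */
    static String keyFor(String tenantId, String userId) {
        return bucketKey(tenantId, Math.floorMod(userId.hashCode(), BUCKETS));
    }

    static String bucketKey(String tenantId, int bucket) {
        return "tenant-users:v1:{" + tenantId + "}:b" + bucket;
    }

    /** All bucket keys for a tenant, for reads that must visit every bucket. */
    static List<String> allKeys(String tenantId) {
        List<String> keys = new ArrayList<>(BUCKETS);
        for (int bucket = 0; bucket < BUCKETS; bucket++) {
            keys.add(bucketKey(tenantId, bucket));
        }
        return keys;
    }
}
```

Changing `BUCKETS` later remaps members to different buckets, so the bucket count is effectively part of the key schema version.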

14. Hot Key Pathology

A hot key is a key receiving disproportionate traffic.

Examples:

config:v1:global
feature-flags:v1:all
tenant:v1:{mega-tenant}:quota
leaderboard:v1:global
feed:v1:celebrity-user

Hot keys create:

server CPU concentration
cluster slot concentration
network bottleneck
increased p99 latency
failover amplification
client-side retry storms

Hot key is not always high memory

A 20-byte key can be hot. A 50 MB key can be cold. Treat hotness and size as separate dimensions.

Hot key mitigation

Use case	Mitigation
global config	local in-process cache + pub/sub invalidation hint
global counter	sharded counters + periodic aggregation
global leaderboard	partition + top-N merge
celebrity feed	fanout-on-write or precomputed shards
per-mega-tenant quota	sub-shard by route/user/client then aggregate
session touch hotness	coalesce writes, sliding TTL threshold

Sharded counter

counter:v1:{tenant-123}:shard:0
counter:v1:{tenant-123}:shard:1
...
counter:v1:{tenant-123}:shard:63

Increment:

int shard = ThreadLocalRandom.current().nextInt(64);
redis.incr("counter:v1:{tenant-123}:shard:" + shard);

Read:

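// Shards never incremented have no key yet, so skip nulls (uses java.util.Objects).
long total = LongStream.range(0, 64)
    .mapToObj(i -> redis.get("counter:v1:{tenant-123}:shard:" + i))
    .filter(Objects::nonNull)
    .mapToLong(Long::parseLong)
    .sum();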

Trade-off:

Benefit	Cost
spreads write load	read requires aggregation
reduces hot key pressure	exact real-time limit harder
improves Cluster distribution if hash tags vary	multi-key atomicity harder

Careful: if all counter shards use the same hash tag, as in the example above, they all map to the same Cluster slot and therefore the same node. That may be required for atomicity, but it spreads load only across keys, not across Cluster nodes. Choose based on the invariant.
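The hash tag effect can be checked offline by computing the slot the way Redis Cluster does: CRC16 (XMODEM variant) of the key, or of the hash tag content when a non-empty `{...}` is present, modulo 16384. This is a sketch of that calculation for verification purposes; `ClusterSlot` is a hypothetical helper, and production code would normally rely on the client library or `CLUSTER KEYSLOT`.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: computes the Redis Cluster hash slot of a key. */
final class ClusterSlot {
    /** CRC16 XMODEM: polynomial 0x1021, initial value 0, no reflection. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int bit = 0; bit < 8; bit++) {
                crc = (crc & 0x8000) != 0 ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    /** Content between the first '{' and the next '}' if non-empty, else the whole key. */
    static String hashedPart(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                return key.substring(open + 1, close);
            }
        }
        return key;
    }

    static int slot(String key) {
        return crc16(hashedPart(key).getBytes(StandardCharsets.UTF_8)) % 16384;
    }
}
```

For the counter keys above, `slot("counter:v1:{tenant-123}:shard:0")` and `slot("counter:v1:{tenant-123}:shard:63")` are equal because only `tenant-123` is hashed. Moving the shard number inside the braces would spread the shards across slots at the cost of multi-key atomicity.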

15. Fragmentation and RSS

Redis memory fragmentation means the OS resident memory can be higher than Redis logical allocated memory.

Simplified model:

used_memory = memory Redis allocator believes it uses
used_memory_rss = memory pages resident in OS
fragmentation ≈ used_memory_rss / used_memory

High fragmentation can happen after:

many keys expire/delete
workload shifts from large to small values
allocator cannot return pages quickly
large temporary allocations
persistence rewrite/copy-on-write behavior

Symptoms:

used_memory drops but RSS stays high
container memory limit gets pressured
host-level memory alert fires while Redis logical memory looks fine
fragmentation ratio rises after mass expiration/deletion

Mitigations:

Mitigation	Notes
avoid massive synchronized deletes	spread deletion over time
use TTL jitter	reduces simultaneous expiration
use lazy deletion where appropriate	avoids blocking large synchronous free
enable/tune active defragmentation when suitable	operational setting; test first
restart during maintenance if necessary	last-resort memory compaction
avoid huge value churn	redesign large payload lifecycle

Do not assume memory freed inside Redis is instantly returned to the OS. Containers make this more visible because cgroup limits are strict.

16. Persistence and Memory Pressure

Persistence can increase memory risk.

During RDB snapshot or AOF rewrite, copy-on-write behavior can cause additional memory usage when pages are modified while the child process writes data.

Practical implication:

A Redis instance that is safe at 90% memory during normal operation may be unsafe during persistence rewrite or heavy write load.

Operational rules:

keep memory headroom for persistence operations
avoid massive write spikes during rewrite windows
test snapshot/rewrite under realistic write load
monitor fork time and copy-on-write memory
avoid running at the edge of container memory limits

If Redis is pure cache and persistence is disabled, this pressure may be lower. If Redis owns recoverable state, persistence headroom is part of correctness.

17. Eviction Storms

An eviction storm occurs when Redis continuously evicts keys but incoming writes keep exceeding memory.

Symptoms:

evicted_keys rises quickly
hit rate drops
DB load increases due to cache misses
application latency increases
Redis CPU rises
retry traffic increases
more keys are regenerated
more writes trigger more eviction

Response:

Identify key patterns causing growth.
Temporarily reduce write amplification if possible.
Increase memory or scale out if capacity is truly insufficient.
Shorten TTL for low-value cache groups.
Disable/regulate batch jobs writing cache.
Protect critical Redis clusters from cache churn.
Review release that changed payload size/cardinality.

Do not simply flush Redis unless you understand the downstream blast radius. A full flush may create a database stampede.

18. Memory Policy by Redis Role

Different Redis roles need different memory policies.

Redis role	Memory strategy
read cache	high memory utilization allowed; LRU/LFU eviction acceptable
session store	noeviction or strict volatile policy; missing session is user-visible/security relevant
idempotency store	noeviction or reserved memory; eviction can cause duplicate side effects
rate limiter	enough headroom; eviction can bypass limits
lock/lease store	avoid eviction; lock disappearance changes coordination semantics
delayed queue	noeviction; eviction loses scheduled work
stream processing	noeviction plus trim policy; eviction is not queue management
search/vector	capacity plan index memory; eviction may corrupt expected query completeness
metrics/time series	retention/compaction; eviction only if approximate/optional

This is why one shared Redis for everything is easy at first and painful later.

19. Java Client Memory Behavior

Redis memory is not the only memory in the system. Java clients can also blow up.

Common client-side issues:

unbounded async command futures
huge pipeline result accumulation
large MGET responses
deserializing huge values into object graphs
reactive stream without backpressure discipline
connection output buffer growth during slow server/network
retry queues during Redis outage
logging large payloads

Bounded pipelining

Bad:

for (String key : millionKeys) {
    async.get(key);
}

This creates a huge number of outstanding futures.

Better:

int batchSize = 500;

for (List<String> batch : Lists.partition(keys, batchSize)) {
    List<RedisFuture<String>> futures = new ArrayList<>();
    for (String key : batch) {
        futures.add(async.get(key));
    }

    for (RedisFuture<String> future : futures) {
        process(future.get(200, TimeUnit.MILLISECONDS));
    }
}

Still not perfect, but bounded.
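One way to go further is a sliding window: a semaphore caps the number of commands in flight, and each result is handed to a callback instead of being collected. This is a sketch using only `java.util.concurrent`; `asyncGet` stands in for any client call that returns a `CompletableFuture` (an assumption, since the batch example above uses a different future type), and `BoundedInFlight` is a hypothetical name.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical helper: issues async commands with a hard cap on in-flight requests. */
final class BoundedInFlight {
    static <K, V> void forEachBounded(List<K> keys, int maxInFlight,
                                      Function<K, CompletableFuture<V>> asyncGet,
                                      Consumer<V> onValue) {
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicReference<Throwable> firstError = new AtomicReference<>();
        for (K key : keys) {
            permits.acquireUninterruptibly(); // blocks the producer when the window is full
            CompletableFuture<V> future;
            try {
                future = asyncGet.apply(key);
            } catch (RuntimeException e) {
                permits.release();
                throw e;
            }
            future.whenComplete((value, error) -> {
                try {
                    if (error != null) {
                        firstError.compareAndSet(null, error);
                    } else {
                        onValue.accept(value); // runs on the completing thread
                    }
                } catch (RuntimeException e) {
                    firstError.compareAndSet(null, e);
                } finally {
                    permits.release();
                }
            });
        }
        // Taking every permit means all outstanding callbacks have finished.
        permits.acquireUninterruptibly(maxInFlight);
        Throwable error = firstError.get();
        if (error != null) {
            throw new IllegalStateException("at least one command failed", error);
        }
    }
}
```

The callback must be thread-safe because it runs on whichever thread completes the future. Memory held by the client is then proportional to `maxInFlight`, not to the number of keys.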

Payload guardrail

At infrastructure boundary:

public byte[] serializeForRedis(Object value, int maxBytes) {
    byte[] bytes = serializer.serialize(value);
    if (bytes.length > maxBytes) {
        throw new RedisPayloadTooLargeException(bytes.length, maxBytes);
    }
    return bytes;
}

Do not let accidental large objects enter Redis silently.

20. Compression Trade-offs

Compression can save memory but costs CPU and latency.

Use compression when:

values are large enough to benefit
data is compressible
Redis memory is more constrained than CPU
p99 latency budget can absorb compression/decompression
payload is not frequently partially updated

Avoid compression when:

values are tiny
QPS is extremely high
CPU is already saturated
you need field-level updates
compression hides schema bloat instead of fixing it

Decision pattern:

if (bytes.length >= compressionThresholdBytes) {
    return compress(bytes);
}
return bytes;

Test with realistic data. Do not assume JSON compresses enough to justify CPU cost on hot paths.
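A slightly fuller version of the decision pattern keeps the compressed form only when it is actually smaller, and records which form was stored. The sketch below uses GZIP from the JDK and a one-byte marker prefix; the marker format and the class name `ThresholdCompressor` are assumptions made for illustration, not an established convention.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical codec: compresses only large payloads, and only when it helps. */
final class ThresholdCompressor {
    private static final byte RAW = 0;
    private static final byte GZIP = 1;
    private final int thresholdBytes;

    ThresholdCompressor(int thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    /** Returns marker byte + body; the body is gzip output only if that is smaller. */
    byte[] encode(byte[] payload) {
        if (payload.length >= thresholdBytes) {
            byte[] zipped = gzip(payload);
            if (zipped.length < payload.length) {
                return withMarker(GZIP, zipped);
            }
        }
        return withMarker(RAW, payload);
    }

    byte[] decode(byte[] stored) {
        byte[] body = Arrays.copyOfRange(stored, 1, stored.length);
        return stored[0] == GZIP ? gunzip(body) : body;
    }

    private static byte[] withMarker(byte marker, byte[] body) {
        byte[] out = new byte[body.length + 1];
        out[0] = marker;
        System.arraycopy(body, 0, out, 1, body.length);
        return out;
    }

    private static byte[] gzip(byte[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    private static byte[] gunzip(byte[] data) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return gz.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The size comparison guards against the case listed earlier where gzip makes tiny or incompressible values larger. The marker byte costs one byte per value, which is part of the trade-off to measure.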

21. Memory-Safe Data Modeling Examples

Example 1: Session store

Naive:

session:{sessionId} -> huge JSON containing profile, permissions, preferences, cart, risk state

Better:

session:v1:{tenantId}:sid:{sessionId} -> small auth/session core
session-perm:v1:{tenantId}:sid:{sessionId} -> permission snapshot if needed
session-risk:v1:{tenantId}:sid:{sessionId} -> risk snapshot if needed

Rationale:

session touch does not rewrite huge object
different TTLs possible
memory growth is visible by group
fewer large value spikes

Example 2: Product cache

Naive:

product:v1:{productId} -> full catalog aggregate with all attributes and relationships

Better:

product-summary:v1:{productId} -> list/search result projection
product-detail:v1:{productId} -> detail projection
product-price:v1:{productId}:{currency} -> pricing slice

Rationale:

read path payload matches use case
price invalidation does not evict detail cache
hot summary reads avoid detail bloat

Example 3: Tenant quota

Naive:

quota:v1:all-tenants -> one huge hash

Better:

quota:v1:{tenantId}:minute:{yyyyMMddHHmm}
quota:v1:{tenantId}:day:{yyyyMMdd}

Rationale:

per-tenant lifecycle
Cluster slot based on tenant
avoids global hot key
TTL per window

22. Operational Runbook: Memory Spike

When Redis memory spikes:

Step 1: Establish scope

Ask:

Which Redis instance/cluster?
Which role: cache, session, queue, search, vector?
Did used_memory rise, RSS rise, or both?
Did evictions start?
Did hit rate drop?
Was there a recent deployment/batch/import?

Step 2: Inspect Redis signals

INFO MEMORY
INFO STATS
INFO KEYSPACE
CONFIG GET maxmemory
CONFIG GET maxmemory-policy
SLOWLOG GET 20

Also inspect application metrics:

Redis write QPS by operation
payload size histogram
cache put rate
TTL distribution
error/retry rate
deployment timeline

Step 3: Find growth pattern

Use safe sampling and known key prefixes.

Check representative keys:

MEMORY USAGE some:key
TTL some:key
TYPE some:key
HLEN some:hash
SCARD some:set
ZCARD some:zset
STRLEN some:string

Step 4: Stabilize

Options:

Action	Use when	Risk
increase memory	real capacity shortage	cost, hides leak
scale/shard	hot slot or dataset growth	migration complexity
shorten TTL	low-value cache	hit rate drop
stop batch writer	runaway import/cache warmup	stale data
disable non-critical cache writes	pressure relief	more DB load
delete specific bad prefix	known accidental keys	stampede or data loss
flush all	last resort	major downstream blast radius

Step 5: Prevent recurrence

Create follow-up tasks:

payload size regression test
key cardinality alert
TTL missing alert
memory budget per key prefix
release checklist update
cache warmup throttle
eviction/hit-rate SLO

23. Operational Runbook: Eviction Started

Eviction is not always an outage. For pure cache, it may be expected. For coordination/idempotency/session, it may be severe.

Triage:

Identify maxmemory-policy.
Identify whether evicted keys are discardable.
Check hit rate and downstream load.
Check whether critical keyspaces share the instance.
Check if a release increased payload/cardinality.
Check if TTL is missing on cache keys.
Check if hot keys are being regenerated repeatedly.

Decision:

Observation	Interpretation
evictions + stable hit rate + pure cache	acceptable pressure behavior
evictions + DB spike	cache too small or churn too high
evictions in mixed-state Redis	correctness risk
evictions after release	payload/cardinality regression likely
evictions with many no-TTL cache keys	lifecycle bug

24. Operational Runbook: Hot Key Incident

Symptoms:

one shard has much higher CPU/QPS
p99 spikes for specific operation
Cluster slot imbalance
client timeouts for one route/tenant
Redis commandstats shows high call volume for simple command

Triage:

Identify operation causing load.
Identify key pattern and cardinality.
Determine if key is hot due to global dimension or mega tenant.
Check if local caching is possible.
Check if sharding breaks correctness.
Apply immediate throttle/coalescing if needed.

Mitigation ladder:

local in-process cache
→ request coalescing
→ TTL/stale-while-revalidate
→ sharded key
→ precomputed materialized views
→ topology split
→ product/API redesign
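The request coalescing rung of the ladder can be sketched as a single-flight wrapper: concurrent callers for the same key share one in-progress load instead of each hitting Redis. This is a minimal in-process illustration; `SingleFlight` and its `load` method are hypothetical names, and the loader is assumed to return a `CompletableFuture`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical coalescer: at most one load per key is in progress at a time. */
final class SingleFlight<K, V> {
    private final ConcurrentHashMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();

    CompletableFuture<V> load(K key, Supplier<CompletableFuture<V>> loader) {
        CompletableFuture<V> mine = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, mine);
        if (existing != null) {
            return existing; // join the load that is already in progress
        }
        try {
            loader.get().whenComplete((value, error) -> {
                // Remove first so that later callers start a fresh load.
                inFlight.remove(key, mine);
                if (error != null) {
                    mine.completeExceptionally(error);
                } else {
                    mine.complete(value);
                }
            });
        } catch (RuntimeException e) {
            inFlight.remove(key, mine);
            mine.completeExceptionally(e);
        }
        return mine;
    }
}
```

This only coalesces within one JVM. It does not cache results, so it is usually combined with the local in-process cache from the first rung.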

25. Alerts and SLOs

Recommended Redis memory alerts:

Alert	Signal
high memory usage	`used_memory / maxmemory`
fast memory growth	derivative over 5/30/60 minutes
evictions started	`evicted_keys` rate > 0 for critical roles
hit rate drop	cache hit ratio below SLO
fragmentation high	fragmentation ratio above expected baseline
no-TTL cache keys	sampled cache keys with `TTL = -1`
payload regression	application payload histogram p95/p99 increase
large key	sampled `MEMORY USAGE`/cardinality above threshold
hot key/slot	shard CPU/QPS imbalance
client buffer growth	client memory/buffer metrics rising
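The "evictions started" alert needs a rate, but `evicted_keys` is a cumulative counter that restarts from zero when the server restarts or stats are reset. A minimal sketch of the rate calculation and a role-based severity, assuming two samples taken a known interval apart; the role names, the `cacheWarnRate` threshold, and `EvictionAlert` are hypothetical.

```java
/** Hypothetical alert rule: eviction rate from counter samples, judged by Redis role. */
final class EvictionAlert {
    enum Role { CACHE, COORDINATION, QUEUE }
    enum Severity { OK, WARN, PAGE }

    /** Evictions per second between two samples of the cumulative evicted_keys counter. */
    static double evictionsPerSecond(long previous, long current, double elapsedSeconds) {
        if (elapsedSeconds <= 0) {
            throw new IllegalArgumentException("elapsedSeconds must be positive");
        }
        // A lower value than before means the counter was reset; count from zero.
        long delta = current >= previous ? current - previous : current;
        return delta / elapsedSeconds;
    }

    /** Caches tolerate evictions up to cacheWarnRate; any eviction on other roles pages. */
    static Severity classify(Role role, double evictionsPerSecond, double cacheWarnRate) {
        if (evictionsPerSecond <= 0) {
            return Severity.OK;
        }
        if (role == Role.CACHE) {
            return evictionsPerSecond > cacheWarnRate ? Severity.WARN : Severity.OK;
        }
        return Severity.PAGE;
    }
}
```

The asymmetry mirrors the table: the same eviction rate is routine for a cache cluster and an incident for a coordination or queue cluster.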

SLO examples:

Cache cluster:
- p99 Redis command latency < 10 ms
- hit rate > 92%
- evictions allowed but DB fallback must stay within SLO

Coordination cluster:
- evicted_keys rate must be 0
- used_memory < 70% maxmemory sustained
- write OOM must be 0

Queue/stream cluster:
- evicted_keys rate must be 0
- pending/lag bounded
- persistence rewrite memory headroom maintained

26. Review Checklist

Before approving a Redis schema or feature:

27. 20-Hour Practice Block

Hour 1–3: Measure payloads

Take three real DTOs:

session
product cache
idempotency record

Serialize them with your actual Redis serializer. Measure p50/p95/p99 size. Add regression tests.
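For the percentile step, a nearest-rank calculation over the serialized byte lengths is enough. This is a small sketch, assuming the sizes have already been collected by serializing sample objects; `PayloadSizeStats` is a hypothetical helper.

```java
import java.util.Arrays;

/** Hypothetical helper: nearest-rank percentiles over serialized payload sizes. */
final class PayloadSizeStats {
    /** Nearest-rank percentile; p must be in (0, 100]. The input array is not modified. */
    static int percentile(int[] sizes, double p) {
        if (sizes.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        int[] sorted = sizes.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

A regression test can then assert on `percentile(sizes, 99)` rather than on a single sample, which catches growth in the tail that an average would hide.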

Hour 4–6: Load realistic data

Using Testcontainers or a local Redis:

generate 100k realistic keys
load them
measure INFO MEMORY
measure MEMORY USAGE for samples
compare string vs hash vs JSON modeling

Hour 7–9: TTL lifecycle

Implement:

fixed TTL cache
jittered TTL cache
sliding session TTL with refresh threshold
missing TTL detector for cache prefix

Hour 10–12: Eviction simulation

Set small maxmemory. Try:

allkeys-lru
allkeys-lfu
volatile-lru
noeviction

Observe application behavior. Document what breaks.

Hour 13–15: Large key experiment

Create:

one huge set
bucketed sets
one huge JSON string
split projection keys

Measure latency and memory.

Hour 16–18: Hot key experiment

Simulate:

one global counter
sharded counter
local cache + Redis refresh

Measure p99 and Redis QPS.

Hour 19–20: Memory review doc

Create a memory budget for one feature:

feature:
key patterns:
cardinality:
value size:
ttl:
eviction impact:
hot key risk:
large key risk:
maxmemory impact:
alerts:
rollback:

28. Part Summary

Redis memory engineering is not just capacity. It is correctness and performance design.

Key lessons:

Redis memory includes keys, values, metadata, indexes, buffers, fragmentation, replication, and persistence overhead.
High key count can be expensive even when values are tiny.
Large keys damage latency, migration, deletion, and operational safety.
Hot keys damage throughput and p99 even if they are small.
TTL is lifecycle semantics, not just cleanup.
Expiration is not a durable scheduler.
Eviction is acceptable only for data whose disappearance is part of the design.
Mixed-role Redis deployments increase blast radius.
Java serialization can silently multiply memory usage.
Payload size, TTL, key count, and cardinality should have tests and alerts.

Top 1% Redis engineers do not ask only, “How much RAM do we need?” They ask:

What memory growth curve are we creating, what disappears under pressure, what gets hot, what gets large, and what downstream system fails when Redis memory policy activates?

References

Redis Docs — Key eviction: https://redis.io/docs/latest/develop/reference/eviction/
Redis Docs — Memory optimization: https://redis.io/docs/latest/operate/oss_and_stack/management/optimization/memory-optimization/
Redis Docs — EXPIRE command and TTL options: https://redis.io/docs/latest/commands/expire/
Redis Docs — MEMORY USAGE command: https://redis.io/docs/latest/commands/memory-usage/
Redis Docs — INFO command: https://redis.io/docs/latest/commands/info/
Redis Docs — Pipelining and request/response behavior: https://redis.io/docs/latest/develop/using-commands/pipelining/
Redis Docs — Persistence: https://redis.io/docs/latest/operate/oss_and_stack/management/persistence/
