
Idempotency, Deduplication, and Exactly-Once Illusions

Learn Java Redis In Action - Part 016

Production idempotency and deduplication patterns with Redis and Java: idempotency keys, request replay, retry safety, event dedup, webhook dedup, TTL-bound correctness, state machines, Lua atomic claims, result caching, and exactly-once illusions.


Part 016 — Idempotency, Deduplication, and Exactly-Once Illusions

Part 015 covered cache consistency. Now we move to a different but related problem:

How do we make retry-heavy distributed systems safe when the same command, event, or webhook may arrive more than once?

Redis is frequently used for:

  • idempotency keys
  • duplicate request suppression
  • webhook deduplication
  • event ingestion guards
  • distributed retry protection
  • temporary uniqueness windows
  • exactly-once-like user experience

But Redis does not magically give exactly-once business semantics. It gives fast atomic primitives that can help you build bounded idempotency.

Bounded idempotency is honest engineering. Exactly-once is usually a slogan.


1. Kaufman Skill Decomposition

Target skill:

Use Redis to design idempotent Java workflows that remain safe under retries, duplicate delivery, client timeouts, worker crashes, and partial failures.

Sub-skills:

Sub-skill | What you must be able to do
Idempotency mental model | Distinguish duplicate suppression from business idempotency
Key design | Build stable idempotency keys from correct scope and intent
Atomic claim | Use Redis SET NX or Lua to claim work safely
In-progress state | Represent started, completed, failed, and expired states
Result replay | Return same response for duplicate successful requests
Retry safety | Avoid double side effects when clients retry
Dedup window | Choose TTL from retry horizon and business risk
Event dedup | Handle duplicate message/event/webhook delivery
Failure recovery | Repair stuck in-progress markers
Source authority | Know when DB uniqueness must complement Redis

Practice rule:

Do not ask “can Redis block duplicates?” Ask “which duplicates, for how long, at which boundary, and what response should duplicate callers receive?”


2. Idempotency vs Deduplication

These terms are related but not identical.

Concept | Meaning
Idempotency | Repeating the same operation has the same intended effect
Deduplication | Detecting and suppressing duplicate inputs
Replay | Returning stored result for a repeated request
Exactly-once | A system-level claim that an effect happens once despite failures
Effect-once | Business side effect is committed at most once
Response-once | Caller receives same response for same idempotency key
Delivery-once | Transport delivers message once; rare in real systems

Example:

POST /payments
Idempotency-Key: payreq_abc123

If the client retries because of timeout:

  • duplicate request should not charge twice
  • duplicate request should receive the original payment result if available
  • if original request is still processing, duplicate should not start another charge

That is idempotency.

Dedup alone would only say:

I have seen this key before, reject it.

That may be insufficient because the client may need the original result.


3. Why Retries Create Duplicates

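Distributed systems create duplicates naturally. A typical sequence: the client sends a request, the service performs the side effect, the response is lost or times out, and the client sends the same request again.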

The retry is correct from the client’s perspective. The service must decide whether this is:

  • same request replay
  • conflicting request using same key
  • concurrent duplicate
  • old expired duplicate
  • malicious replay

Idempotency is not optional in systems with:

  • mobile clients
  • browser retries
  • load balancer timeouts
  • payment providers
  • webhook delivery
  • async workers
  • message brokers
  • distributed sagas
  • external API calls

4. The Most Important Invariant

For command idempotency:

The idempotency key must represent the caller’s intent, not merely a technical request attempt.

Bad key:

random UUID generated by server per received request

This does not deduplicate client retries because every retry gets a different key.

Better:

client-provided idempotency key scoped to tenant/user/operation

Example key:

idem:payment-create:{tenant-a:user-42}:payreq_abc123

But the key alone is not enough. You must store a fingerprint of the request payload.

Why?

The same idempotency key reused with different payload should not replay old result silently. It should return a conflict.

Payload fingerprint:

sha256(method + path + canonical_json_body + tenant + account)

Stored value:

{
  "state": "COMPLETED",
  "requestHash": "sha256:...",
  "statusCode": 201,
  "responseBody": {...},
  "startedAtEpochMs": 1783000000000,
  "completedAtEpochMs": 1783000001200
}

5. Idempotency State Machine

Simple SET NX dedup is not enough for commands that produce responses. Use a state machine.

Important states:

State | Meaning | Duplicate behavior
absent | no known request | attempt claim
in_progress | request currently running | return 409/202 or wait bounded
completed | successful result stored | replay same result
failed_retryable | transient failure | allow retry or return retryable status
failed_final | permanent failure | replay failure response
expired | too old to trust | policy-specific

For high-risk workflows, store the final authoritative result in the database too. Redis should not be the only evidence of a payment or order command.
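
The states in the table above can be encoded as a small Java enum. This is a minimal sketch: the state names come from the table, but the transition map (for example, that an expired or retryable-failed record may be claimed again) is an assumption to adapt to your own policy, and `IdemState` is a hypothetical name.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Idempotency record states; the allowed transitions are an illustrative assumption. */
enum IdemState {
    ABSENT, IN_PROGRESS, COMPLETED, FAILED_RETRYABLE, FAILED_FINAL, EXPIRED;

    private static final Map<IdemState, Set<IdemState>> ALLOWED = Map.of(
        ABSENT, EnumSet.of(IN_PROGRESS),
        IN_PROGRESS, EnumSet.of(COMPLETED, FAILED_RETRYABLE, FAILED_FINAL, EXPIRED),
        FAILED_RETRYABLE, EnumSet.of(IN_PROGRESS, EXPIRED),
        COMPLETED, EnumSet.of(EXPIRED),
        FAILED_FINAL, EnumSet.of(EXPIRED),
        EXPIRED, EnumSet.of(IN_PROGRESS)
    );

    /** True if a record in this state may move to the given next state. */
    boolean canTransitionTo(IdemState next) {
        return ALLOWED.get(this).contains(next);
    }

    /** Terminal outcomes are replayed to duplicate callers instead of re-executed. */
    boolean replaysStoredResponse() {
        return this == COMPLETED || this == FAILED_FINAL;
    }
}
```

Keeping the transitions in one place makes it easier to reject illegal updates, such as a stale worker trying to move a COMPLETED record back to IN_PROGRESS.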


6. Basic Atomic Claim with SET NX

Redis SET supports conditional set with expiry:

SET idem:payment:{scope}:K <in-progress-envelope> NX PX 300000

Meaning:

  • NX: only set if key does not exist
  • PX: expiry in milliseconds
  • value: in-progress marker
  • TTL: max time before marker is considered abandoned

Java sketch with Lettuce-style API:

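public ClaimResult claim(IdempotencyKey key, RequestFingerprint fingerprint) {
    String redisKey = key.toRedisKey();
    // Serialize the IN_PROGRESS envelope; handle the serializer's checked exception as your JSON library requires.
    String value = json.writeValueAsString(IdempotencyRecord.inProgress(fingerprint, clock.instant()));

    // NX: set only if the key is absent. PX: the marker expires after 5 minutes.
    SetArgs args = SetArgs.Builder.nx().px(Duration.ofMinutes(5).toMillis());
    String result = redisCommands.set(redisKey, value, args);

    // SET ... NX replies "OK" when this caller won the claim and null otherwise.
    if ("OK".equals(result)) {
        return ClaimResult.claimed();
    }

    // The key existed at SET time but may expire before this read,
    // so readRecord can find nothing; the caller should then retry the claim.
    IdempotencyRecord existing = readRecord(redisKey);
    return ClaimResult.alreadyExists(existing);
}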

This is good for initial claim. It is not sufficient for finalization because finalization must validate owner/state.


7. Why the In-Progress TTL Matters

If a worker crashes after claiming the key, the marker remains.

TTL defines the maximum lockout window.

Too short:

original work still running, marker expires, duplicate starts second side effect

Too long:

crashed work blocks legitimate retry for too long

Choose TTL from:

max expected processing duration + retry jitter + operational repair margin
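
As a worked instance of the formula above, the sketch below adds the three components with `java.time.Duration`. The class name `InProgressTtl` and the sample numbers are hypothetical; real inputs should come from measured processing latency and your retry policy.

```java
import java.time.Duration;

final class InProgressTtl {
    private InProgressTtl() {}

    /** Sums max processing time, retry jitter, and an operational repair margin. */
    static Duration of(Duration maxProcessing, Duration retryJitter, Duration repairMargin) {
        return maxProcessing.plus(retryJitter).plus(repairMargin);
    }
}
```

For example, 90 seconds of worst-case processing, 30 seconds of retry jitter, and a 3-minute repair margin add up to 5 minutes, which is the `PX 300000` used in the claim example earlier.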

For payment/order workflows:

  • do not rely only on Redis TTL
  • use DB unique constraint or provider idempotency when possible
  • use reconciliation if uncertainty remains

8. Result Replay Pattern

When command completes, store the final result.

idem:payment:{tenant-a:user-42}:payreq_abc123 = COMPLETED envelope

Envelope:

{
  "state": "COMPLETED",
  "requestHash": "sha256:abc",
  "resourceType": "payment",
  "resourceId": "pay_789",
  "statusCode": 201,
  "responseBody": {
    "paymentId": "pay_789",
    "status": "AUTHORIZED"
  },
  "startedAtEpochMs": 1783000000000,
  "completedAtEpochMs": 1783000001500
}

Duplicate behavior:

same key + same request hash + completed -> return stored response
same key + different request hash -> 409 conflict
same key + in progress -> 202/409/wait bounded
same key + failed final -> replay stored failure

Why Store Response?

Without response replay, duplicate callers may see inconsistent outcomes:

  • original got 201 but retry gets 409
  • original created resource but retry says duplicate without resource id
  • client cannot recover after timeout

Idempotency should be caller-friendly:

same command identity => same observable result where possible

9. Atomic Finalization with Lua

Finalization should verify:

  • key exists
  • state is IN_PROGRESS
  • request hash matches
  • optional owner token matches
  • then replace with completed result and longer TTL

Why owner token?

If an in-progress marker expires and another worker claims the same key, the old worker must not finalize over the new claim.

Claim envelope:

{
  "state": "IN_PROGRESS",
  "requestHash": "sha256:abc",
  "ownerToken": "worker-token-123",
  "startedAtEpochMs": 1783000000000
}

Lua finalization sketch:

-- KEYS[1] = idempotency key
-- ARGV[1] = expected request hash
-- ARGV[2] = expected owner token
-- ARGV[3] = completed envelope json
-- ARGV[4] = completed ttl seconds

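local current = redis.call('GET', KEYS[1])
if not current then
  return 'MISSING'
end

-- Decode the envelope with cjson instead of matching it with string.find:
-- ARGV values would be interpreted as Lua patterns, and characters such as '-'
-- (as in "worker-token-123") are pattern magic and would not match literally.
local ok, record = pcall(cjson.decode, current)
if not ok or type(record) ~= 'table' then
  return 'CORRUPT'
end

if record['state'] ~= 'IN_PROGRESS' then
  return 'NOT_IN_PROGRESS'
end

if record['requestHash'] ~= ARGV[1] then
  return 'HASH_MISMATCH'
end

if record['ownerToken'] ~= ARGV[2] then
  return 'OWNER_MISMATCH'
end

redis.call('SET', KEYS[1], ARGV[3], 'EX', ARGV[4])
return 'COMPLETED'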

Production improvement:

  • avoid fragile JSON parsing in Lua for hot paths
  • store state/hash/token in a Redis Hash
  • use a Lua script over hash fields

Hash layout:

HSET idem:payment:{scope}:K state IN_PROGRESS requestHash ... ownerToken ... startedAt ...
EXPIRE idem:payment:{scope}:K 300

Then finalization script uses HGET.


10. Hash-Based Idempotency Record

Redis Hash is often better than JSON string for idempotency metadata.

Fields:

state
requestHash
ownerToken
startedAtEpochMs
completedAtEpochMs
statusCode
resourceType
resourceId
responseBody
errorCode

Claim script:

-- KEYS[1] = idempotency key
-- ARGV[1] = request hash
-- ARGV[2] = owner token
-- ARGV[3] = now epoch ms
-- ARGV[4] = in-progress ttl seconds

if redis.call('EXISTS', KEYS[1]) == 1 then
  return 'EXISTS'
end

redis.call('HSET', KEYS[1],
  'state', 'IN_PROGRESS',
  'requestHash', ARGV[1],
  'ownerToken', ARGV[2],
  'startedAtEpochMs', ARGV[3]
)
redis.call('EXPIRE', KEYS[1], ARGV[4])
return 'CLAIMED'

Complete script:

-- KEYS[1] = idempotency key
-- ARGV[1] = request hash
-- ARGV[2] = owner token
-- ARGV[3] = completed epoch ms
-- ARGV[4] = status code
-- ARGV[5] = resource id
-- ARGV[6] = response body json
-- ARGV[7] = completed ttl seconds

local state = redis.call('HGET', KEYS[1], 'state')
if not state then return 'MISSING' end
if state ~= 'IN_PROGRESS' then return state end

if redis.call('HGET', KEYS[1], 'requestHash') ~= ARGV[1] then
  return 'HASH_MISMATCH'
end

if redis.call('HGET', KEYS[1], 'ownerToken') ~= ARGV[2] then
  return 'OWNER_MISMATCH'
end

redis.call('HSET', KEYS[1],
  'state', 'COMPLETED',
  'completedAtEpochMs', ARGV[3],
  'statusCode', ARGV[4],
  'resourceId', ARGV[5],
  'responseBody', ARGV[6]
)
redis.call('EXPIRE', KEYS[1], ARGV[7])
return 'COMPLETED'

This is clearer, more efficient for metadata checks, and easier to inspect during incidents.
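
To make the owner-token rule concrete, here is an in-memory Java model of the claim and complete scripts above. It is not a Redis client: `InMemoryIdempotencyStore` is a hypothetical stand-in that uses `synchronized` methods where Redis would give script atomicity, and `expire` simulates the TTL firing.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for the hash-based record; illustrates semantics only. */
final class InMemoryIdempotencyStore {
    private final Map<String, Map<String, String>> records = new HashMap<>();

    /** Mirrors the claim script: create the record only if the key is absent. */
    synchronized String claim(String key, String requestHash, String ownerToken) {
        if (records.containsKey(key)) {
            return "EXISTS";
        }
        Map<String, String> fields = new HashMap<>();
        fields.put("state", "IN_PROGRESS");
        fields.put("requestHash", requestHash);
        fields.put("ownerToken", ownerToken);
        records.put(key, fields);
        return "CLAIMED";
    }

    /** Mirrors the complete script: only the current owner may finalize. */
    synchronized String complete(String key, String requestHash, String ownerToken, String responseBody) {
        Map<String, String> fields = records.get(key);
        if (fields == null) {
            return "MISSING";
        }
        String state = fields.get("state");
        if (!"IN_PROGRESS".equals(state)) {
            return state;
        }
        if (!requestHash.equals(fields.get("requestHash"))) {
            return "HASH_MISMATCH";
        }
        if (!ownerToken.equals(fields.get("ownerToken"))) {
            return "OWNER_MISMATCH";
        }
        fields.put("state", "COMPLETED");
        fields.put("responseBody", responseBody);
        return "COMPLETED";
    }

    /** Simulates TTL expiry of the record. */
    synchronized void expire(String key) {
        records.remove(key);
    }
}
```

If worker A claims, the marker expires, and worker B claims the same key, A's late `complete` call returns `OWNER_MISMATCH` while B's succeeds. That is the stale-worker case the owner token exists to catch.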


11. Request Hash Conflict

Same idempotency key with different payload is not a duplicate. It is a conflict.

Example:

Request A:
Idempotency-Key: K
amount: 100000
currency: IDR

Request B:
Idempotency-Key: K
amount: 200000
currency: IDR

Do not replay A for B. Return conflict:

HTTP/1.1 409 Conflict
Content-Type: application/json

{
  "error": "IDEMPOTENCY_KEY_REUSED_WITH_DIFFERENT_REQUEST",
  "message": "The provided idempotency key was already used for a different request payload."
}

Canonicalize before hashing:

  • sort JSON object fields
  • normalize numeric representation
  • exclude volatile headers
  • include tenant/account scope
  • include method/path/operation
  • include critical body fields
  • include API version if semantics changed

Do not hash raw JSON string directly if clients can vary field order or whitespace.
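
The canonicalization rules above could be sketched in Java as follows. This is an illustration under stated assumptions: `RequestFingerprints` is a hypothetical helper, the body is reduced to a flat map of critical fields (nested JSON would need recursive handling), and each part is length-prefixed so that adjacent parts cannot be confused with one another.

```java
import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

final class RequestFingerprints {
    private RequestFingerprints() {}

    static String fingerprint(String method, String path, String tenant, String account,
                              Map<String, Object> criticalFields) {
        StringBuilder canonical = new StringBuilder();
        append(canonical, method.toUpperCase(Locale.ROOT));
        append(canonical, path);
        append(canonical, tenant);
        append(canonical, account);
        // TreeMap sorts field names, so client field order does not change the hash.
        for (Map.Entry<String, Object> e : new TreeMap<>(criticalFields).entrySet()) {
            append(canonical, e.getKey());
            append(canonical, normalize(e.getValue()));
        }
        return "sha256:" + sha256Hex(canonical.toString());
    }

    // Length-prefix each part so ("ab", "c") and ("a", "bc") produce different input.
    private static void append(StringBuilder sb, String part) {
        sb.append(part.length()).append(':').append(part).append('\n');
    }

    // Numbers are normalized so 100000 and 100000.0 yield the same text.
    private static String normalize(Object value) {
        if (value instanceof Number) {
            return new BigDecimal(value.toString()).stripTrailingZeros().toPlainString();
        }
        return String.valueOf(value);
    }

    private static String sha256Hex(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(text.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

Two requests that differ only in field order or numeric formatting get the same fingerprint, while a changed amount produces a different one and should lead to the 409 conflict described above.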


12. Idempotency Key Scope

Key scope determines what duplicates are considered identical.

Bad:

idem:{idempotencyKey}

Collision risk across tenants/users/operations.

Better:

idem:{operation}:{tenantId}:{actorId}:{idempotencyKey}

Cluster-friendly:

idem:payment-create:{tenant-a:user-42}:payreq_abc123

Scope dimensions:

Dimension | Reason
tenant | prevent cross-tenant collision
actor/account | prevent user key collision
operation | same key can be used for different endpoints
API version | request semantics may change
resource parent | e.g. order under account/cart
region | if Redis is region-local

Invariant:

Two different business intents must not share the same idempotency record.
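
One way to keep scoping consistent is to build every key through a single function. The sketch below assumes the cluster-friendly layout shown above. `IdempotencyKeys`, its length limit, and its rejection of braces, colons, and whitespace inside parts are illustrative choices, not requirements from Redis.

```java
final class IdempotencyKeys {
    private static final int MAX_PART_LENGTH = 255; // assumed limit for illustration

    private IdempotencyKeys() {}

    /** Builds idem:{operation}:{tenant:actor as hash tag}:{clientKey}. */
    static String redisKey(String operation, String tenantId, String actorId, String clientKey) {
        requirePart("operation", operation);
        requirePart("tenantId", tenantId);
        requirePart("actorId", actorId);
        requirePart("clientKey", clientKey);
        // The braces form a Redis Cluster hash tag around the tenant/actor scope.
        return "idem:" + operation + ":{" + tenantId + ":" + actorId + "}:" + clientKey;
    }

    // Reject characters that would break the hash tag or make key parts ambiguous.
    private static void requirePart(String name, String value) {
        if (value == null || value.isEmpty() || value.length() > MAX_PART_LENGTH) {
            throw new IllegalArgumentException(name + " must be 1.." + MAX_PART_LENGTH + " characters");
        }
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '{' || c == '}' || c == ':' || Character.isWhitespace(c)) {
                throw new IllegalArgumentException(name + " contains a reserved character: '" + c + "'");
            }
        }
    }
}
```

Validating parts up front also bounds the key length, which matters because clients control the idempotency key.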


13. Choosing Idempotency TTL

TTL is not only memory cleanup. It is the dedup correctness window.

idempotency TTL >= maximum retry horizon

Consider:

  • client retry duration
  • load balancer timeout
  • mobile offline retry
  • webhook provider retry schedule
  • message broker redelivery window
  • business dispute/reconciliation period
  • operational replay jobs

Examples:

Workflow | TTL
UI form submit | 10 minutes - 24 hours
payment create | 24 hours - several days, often DB-backed longer
order submit | 24 hours - 7 days
webhook provider event | provider retry horizon + margin
internal event dedup | retention/replay window
high-volume telemetry dedup | seconds-minutes
approximate click dedup | minutes-hours

Memory impact:

memory = records_per_second * ttl_seconds * avg_record_size

At 1,000 requests/sec, 24-hour TTL, and 1 KB record:

1000 * 86400 * 1024 ~= 88 GB raw payload before Redis overhead
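
The same arithmetic can be kept in code for capacity reviews. A minimal sketch, where `DedupCapacity` is a hypothetical helper and the result is raw payload only, excluding Redis per-key overhead:

```java
final class DedupCapacity {
    private DedupCapacity() {}

    /** Raw payload bytes held at steady state; throws ArithmeticException on overflow. */
    static long rawBytes(long recordsPerSecond, long ttlSeconds, long avgRecordBytes) {
        long liveRecords = Math.multiplyExact(recordsPerSecond, ttlSeconds);
        return Math.multiplyExact(liveRecords, avgRecordBytes);
    }
}
```

For the example above, `rawBytes(1000, 86400, 1024)` is 88,473,600,000 bytes, which is about 88.5 GB in decimal units or roughly 82.4 GiB.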

Therefore high-throughput idempotency needs:

  • compact records
  • DB-backed final result for critical commands
  • shorter Redis TTL where acceptable
  • sharding/cluster capacity planning
  • cleanup and compression strategy

14. Redis Alone vs DB Unique Constraint

Redis is fast, but not always sufficient.

Requirement | Redis only | DB uniqueness
low-latency duplicate suppression | excellent | ok
survives Redis data loss | no, unless persistent and restored | yes
transactional with business row | no | yes
long-term audit | weak | strong
cross-region strong uniqueness | weak | depends on DB architecture
high throughput temporary window | excellent | can be expensive

For critical workflows:

Redis idempotency guard + DB unique constraint = better than either alone

Example DB table:

CREATE TABLE payment_request_idempotency (
    tenant_id        text NOT NULL,
    actor_id         text NOT NULL,
    idempotency_key  text NOT NULL,
    request_hash     text NOT NULL,
    payment_id       text,
    state            text NOT NULL,
    created_at       timestamptz NOT NULL,
    completed_at     timestamptz,
    PRIMARY KEY (tenant_id, actor_id, idempotency_key)
);

Redis handles fast path. DB enforces durable uniqueness.


15. Side Effects and Commit Ordering

The hardest part is not claiming the idempotency key. The hardest part is side-effect ordering.

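Example payment flow (PSP = payment service provider), in the order implied by the failure windows below:

Redis claim -> PSP charge -> DB insert -> Redis complete -> client response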

Failure windows:

Window | Failure | Risk
after Redis claim, before PSP | stuck in progress | retry blocked until TTL
after PSP success, before DB insert | external charge exists but DB missing | reconciliation needed
after DB insert, before Redis complete | duplicate retry sees in-progress | can recover from DB
after Redis complete, before client response | retry can replay | good

Correctness strategy:

  • external provider should also receive idempotency key if supported
  • DB must record provider transaction ID uniquely
  • Redis completion can be rebuilt from DB if missing
  • reconciliation job handles uncertain side effects

Redis is part of the workflow. It is not the entire correctness proof.


16. Recovering from IN_PROGRESS

When duplicate arrives and state is IN_PROGRESS, options:

Policy | Behavior
return 409 Conflict | client retries later
return 202 Accepted | operation still processing
wait bounded | useful for quick operations
lookup DB by idempotency key | recover completed result
steal after timeout | risky unless owner/lease model supports it
mark retryable failure | repair job transitions state

Recommended for command APIs:

1. If IN_PROGRESS and young: return 202/409 with Retry-After.
2. If IN_PROGRESS and old: check authoritative DB.
3. If DB has completed result: update Redis to COMPLETED and replay.
4. If DB has no result: allow retry only if side effects are safe or reconciled.

Java sketch:

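public IdempotencyDecision handleExisting(IdempotencyRecord record, Command command) {
    // Same key with a different payload is a conflict, never a replay.
    if (!record.requestHash().equals(command.fingerprint())) {
        return IdempotencyDecision.conflict();
    }

    // Terminal outcomes (success or permanent failure) are replayed as stored.
    // isFailedFinal() is assumed here by analogy with isCompleted().
    if (record.isCompleted() || record.isFailedFinal()) {
        return IdempotencyDecision.replay(record.response());
    }

    // Young in-progress record: the original request is probably still running.
    if (record.isInProgress() && record.age(clock).compareTo(Duration.ofSeconds(30)) < 0) {
        return IdempotencyDecision.processing(Duration.ofSeconds(2));
    }

    // Old in-progress (or retryable-failed) record: consult the authoritative store first.
    Optional<CommandResult> recovered = authoritativeStore.findByIdempotencyKey(command.scope(), command.key());
    if (recovered.isPresent()) {
        idempotencyStore.completeFromRecoveredResult(command.key(), recovered.get());
        return IdempotencyDecision.replay(recovered.get().response());
    }

    return IdempotencyDecision.retryAllowedWithCaution();
}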

17. Webhook Deduplication

Webhook providers often deliver the same event multiple times.

Dedup key (the braces below are placeholders, not Redis Cluster hash tags):

dedup:webhook:{provider}:{tenant}:event:{eventId}

Basic pattern:

SET dedup:webhook:stripe:tenant-a:event:evt_123 seen NX EX 604800

If result is OK, process. If not, duplicate; acknowledge without processing.

Important ordering:

Validate webhook signature before trusting event identity for security decisions.

But dedup sometimes happens before expensive processing after minimal safe parsing. The exact ordering depends on threat model.

For financial events, store event ID in DB too. Redis dedup TTL only covers recent duplicates.
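
The signature check that precedes dedup might look like the sketch below. It assumes a generic scheme in which the provider sends a hex-encoded HMAC-SHA256 of the raw request body. Real providers differ (many also sign a timestamp), so treat `WebhookSignatures` and its parameters as hypothetical and follow your provider's documentation for the exact signed payload.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Locale;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class WebhookSignatures {
    private WebhookSignatures() {}

    /** Computes the lowercase hex HMAC-SHA256 of the raw body. */
    static String hmacSha256Hex(byte[] secret, byte[] body) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            StringBuilder hex = new StringBuilder();
            for (byte b : mac.doFinal(body)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable or key invalid", e);
        }
    }

    /** Compares with MessageDigest.isEqual so the check does not leak the first mismatching position. */
    static boolean isValid(byte[] secret, byte[] body, String presentedHex) {
        byte[] expected = hmacSha256Hex(secret, body).getBytes(StandardCharsets.US_ASCII);
        byte[] presented = presentedHex.toLowerCase(Locale.ROOT).getBytes(StandardCharsets.US_ASCII);
        return MessageDigest.isEqual(expected, presented);
    }
}
```

Only after `isValid` returns true would the handler extract the event ID and run the `SET ... NX` dedup claim, so an attacker cannot poison the dedup window with forged event IDs.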


18. Event Consumer Deduplication

Message brokers may redeliver. Consumers must be idempotent.

Dedup key:

dedup:event:{consumerName}:{topic}:{partition}:{offset}

But offset-based dedup only identifies delivery, not business event identity. Often better:

dedup:event:{consumerName}:{eventType}:{eventId}

Rules:

  • dedup per consumer, not globally, if multiple consumers need to process same event
  • dedup event ID, not payload hash, when event ID is reliable
  • store processing state for long-running handlers
  • commit broker offset only after side effects are safe

Consumer Flow

If side effect succeeds but Redis fails before marking processed, duplicate may process again. Therefore the side effect itself should also be idempotent where important.


19. Dedup Before or After Processing?

Two common patterns:

Claim Before Processing

SET dedup key NX
process side effect

Benefit:

  • prevents concurrent duplicate processing

Risk:

  • crash after claim before processing may suppress legitimate retry until TTL

Mark After Processing

process side effect
SET dedup key

Benefit:

  • no false suppression before work

Risk:

  • concurrent duplicates can process simultaneously

Better for serious workflows:

IN_PROGRESS -> COMPLETED state machine

or use DB transaction:

INSERT processed_event(event_id) unique
apply side effect in same transaction
commit

Redis claim is best when:

  • side effect is safe to retry
  • short duplicate window is enough
  • low latency matters
  • DB unique table is too expensive for all events

DB idempotency is best when:

  • side effect must be durable and exactly once within DB boundary
  • audit is required
  • long replay windows exist

20. Approximate Deduplication

Not all dedup requires exactness.

Examples:

  • click deduplication
  • impression fraud filtering
  • telemetry dedup
  • analytics event collapse
  • repeated search query suppression

Exact set:

SADD dedup:clicks:2026-07-02 fingerprint

Problem:

memory grows with number of unique fingerprints

Alternatives:

  • bitmap with hashed position
  • Bloom filter
  • Cuckoo filter
  • Count-Min Sketch for frequency

Trade-off:

Structure | False positives | False negatives | Use case
Set | no | no | exact dedup, bounded volume
Bitmap hash | yes, by hash collision | no while bits are never cleared | rough dedup
Bloom filter | yes | no for inserted item under normal assumptions | large-scale membership
Cuckoo filter | yes | no under normal assumptions, supports deletion in some variants | dynamic membership
Count-Min Sketch | overestimates count | no undercount in model | frequency estimate

Approximate dedup must never be used when false positive means dropping a payment/order/security event.
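
To show where the false positives come from, here is a small in-process Bloom filter in Java. It is a teaching sketch, not a replacement for a Redis-side filter: `SimpleBloomFilter`, the FNV-1a-style hashing, and the double-hashing index scheme are illustrative choices.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SimpleBloomFilter(int size, int hashCount) {
        if (size <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    /** Returns true if the item was possibly seen before; always records the item. */
    boolean checkAndAdd(String item) {
        byte[] data = item.getBytes(StandardCharsets.UTF_8);
        int h1 = fnv1a(data, 0x811C9DC5);
        int h2 = fnv1a(data, 0x01000193) | 1; // odd step so the probe sequence varies
        boolean possiblySeen = true;
        for (int i = 0; i < hashCount; i++) {
            int index = Math.floorMod(h1 + i * h2, size);
            if (!bits.get(index)) {
                possiblySeen = false; // at least one unset bit: definitely new
                bits.set(index);
            }
        }
        return possiblySeen;
    }

    // FNV-1a style 32-bit hash with a caller-supplied starting value.
    private static int fnv1a(byte[] data, int seed) {
        int hash = seed;
        for (byte b : data) {
            hash ^= (b & 0xFF);
            hash *= 0x01000193;
        }
        return hash;
    }
}
```

An item that was added always reports as seen, because its bits stay set. A new item can also report as seen when other items happen to have set all of its bits, which is the false positive that makes this structure unsuitable for payments or orders.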


21. Idempotency for REST APIs

HTTP semantics matter.

Method | Natural idempotency
GET | should be safe/idempotent
PUT | usually idempotent if replacing resource
DELETE | usually idempotent, depending on response semantics
POST | often not idempotent without key
PATCH | depends on operation semantics

Idempotency key is most useful for POST commands:

POST /orders
Idempotency-Key: ordreq_123

Response cases:

Existing state | Response
absent | process normally
in progress same hash | 202 or 409 with Retry-After
completed same hash | replay stored status/body
failed final same hash | replay stored failure
same key different hash | 409 conflict
expired | policy-specific; often treat as new or reject

Header design:

Idempotency-Key: <client-generated-key>
Idempotency-Replayed: true
Retry-After: 2

Do not require clients to infer replay from response body only.
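
The response table and headers above can be captured in one mapping function. This is a sketch with hypothetical names (`IdempotentReplies`, `Reply`). It picks 202 for in-progress duplicates and a fixed `Retry-After` of 2 seconds, both of which are policy choices rather than requirements.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class IdempotentReplies {
    private IdempotentReplies() {}

    /** States of a record that already exists for the presented key. */
    enum Existing { IN_PROGRESS, COMPLETED, FAILED_FINAL }

    static final class Reply {
        final int status;
        final Map<String, String> headers;

        Reply(int status, Map<String, String> headers) {
            this.status = status;
            this.headers = Collections.unmodifiableMap(headers);
        }
    }

    /** Maps an existing record to the status and headers a duplicate caller receives. */
    static Reply forExisting(Existing state, boolean sameRequestHash, int storedStatus) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (!sameRequestHash) {
            return new Reply(409, headers); // key reused with a different payload
        }
        if (state == Existing.IN_PROGRESS) {
            headers.put("Retry-After", "2");
            return new Reply(202, headers);
        }
        // COMPLETED and FAILED_FINAL both replay the stored outcome.
        headers.put("Idempotency-Replayed", "true");
        return new Reply(storedStatus, headers);
    }
}
```

Checking the hash before the state means a mismatched payload gets a conflict even while the original request is still running.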


22. Idempotency for Internal Commands

Internal services need idempotency too.

Example command:

{
  "commandId": "cmd_123",
  "commandType": "ApproveCaseEscalation",
  "tenantId": "tenant-a",
  "caseId": "case-7",
  "actorId": "user-42",
  "expectedCaseVersion": 18
}

Key:

idem:command:ApproveCaseEscalation:{tenant-a:case-7}:cmd_123

Important:

  • command ID dedups retries
  • expected version protects business race
  • DB state transition should still be conditional

Example DB update:

UPDATE case_escalation
SET status = 'APPROVED', version = version + 1
WHERE case_id = :caseId
  AND status = 'PENDING_APPROVAL'
  AND version = :expectedVersion;

Redis prevents duplicate command work. DB guards business state transition.


23. Idempotency and State Machines

For lifecycle systems, idempotency should align with state transitions.

Bad thinking:

If duplicate command arrives, ignore it.

Better thinking:

If duplicate command arrives, return the current result of the same transition attempt.

Example case lifecycle:

SubmitCase idempotency:

  • same command ID, same payload: replay submitted result
  • different command ID but case already submitted: return domain conflict or current state depending API contract
  • same command ID with different payload: idempotency conflict

This distinction matters:

Scenario | Meaning
same idempotency key | retry of same intent
different key, same desired transition | separate intent; may be conflict or no-op
current state already advanced | command may be obsolete
expected version mismatch | concurrency conflict

Redis idempotency does not replace domain state validation.


24. Rate of Duplicate Detection vs Memory

High-volume dedup needs capacity math.

Formula:

records = events_per_second * dedup_window_seconds
memory ~= records * average_record_overhead

Example:

50,000 events/sec
1 hour dedup window
180,000,000 records

Exact Redis keys may be too expensive.

Better options:

  • bucketed sets per minute/hour
  • Bloom filter per time bucket
  • Redis Cluster sharding
  • event log compaction elsewhere
  • DB uniqueness only for critical subset
  • dedup at producer plus consumer

Bucketed key:

dedup:webhook:stripe:2026-07-02T15:00

Member:

eventId

Set TTL:

EXPIRE dedup:webhook:stripe:2026-07-02T15:00 604800

This reduces key count but set cardinality must be monitored.
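
Deriving the bucketed key from the receive time could look like this in Java. `DedupBuckets` is a hypothetical helper; it assumes hourly buckets in UTC to match the example key above.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

final class DedupBuckets {
    private static final DateTimeFormatter HOUR_BUCKET =
        DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm").withZone(ZoneOffset.UTC);

    private DedupBuckets() {}

    /** Key of the hourly set that holds event IDs received at the given instant. */
    static String hourlyKey(String provider, Instant receivedAt) {
        Instant bucketStart = receivedAt.truncatedTo(ChronoUnit.HOURS);
        return "dedup:webhook:" + provider + ":" + HOUR_BUCKET.format(bucketStart);
    }
}
```

One consequence of bucketing by receive time: a duplicate that arrives in a later hour lands in a different set, so a lookup limited to the current bucket will miss it. Checking earlier buckets across the dedup window, or bucketing by a timestamp carried in the event itself, avoids that gap.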


25. Sharding Idempotency Keys

In Redis Cluster, keys are automatically distributed by hash slot. But hot scope can still create hot shards.

Bad if all operations use same hash tag:

idem:payment:{tenant-a}:K1
idem:payment:{tenant-a}:K2
idem:payment:{tenant-a}:K3

All keys with {tenant-a} go to one slot.

Better if per actor/order locality is not needed:

idem:payment:{hash(K)}:tenant-a:user-42:K

But if multi-key Lua needs related keys together, use a stable hash tag for that idempotency group:

idem:payment:{tenant-a:user-42:payreq_abc123}:record
idem:payment:{tenant-a:user-42:payreq_abc123}:lock

Rule:

Use hash tags for multi-key atomicity, not casually for readability.
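
To see why a shared hash tag pins keys to one slot, the slot calculation can be reproduced client-side. The sketch below follows the Redis Cluster rule of CRC16 (XMODEM variant) modulo 16384, hashing only the text between the first `{` and the next `}` when that text is non-empty. `ClusterSlots` is a hypothetical helper for reasoning and tests; production clients already do this internally.

```java
import java.nio.charset.StandardCharsets;

final class ClusterSlots {
    private ClusterSlots() {}

    /** CRC16/XMODEM: polynomial 0x1021, initial value 0, no reflection. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    /** Hash slot for a key, honoring a non-empty hash tag if one is present. */
    static int slot(String key) {
        String hashed = key;
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) { // an empty tag "{}" means the whole key is hashed
                hashed = key.substring(open + 1, close);
            }
        }
        return crc16(hashed.getBytes(StandardCharsets.UTF_8)) % 16384;
    }
}
```

With this, `slot("idem:payment:{tenant-a}:K1")` equals `slot("idem:payment:{tenant-a}:K2")` because only `tenant-a` is hashed, which is the hot-slot risk described above.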


26. Handling Redis Persistence and Data Loss

If Redis loses idempotency records, duplicates may pass through.

Risk depends on workflow:

Workflow | Redis data loss impact
UI duplicate form | maybe duplicate record
payment | possible double charge unless DB/provider protects
webhook | reprocess recent event
analytics | duplicate metric
order creation | duplicate order unless DB uniqueness protects

Mitigation:

  • enable appropriate Redis persistence if Redis is part of correctness window
  • use DB unique constraints for irreversible side effects
  • use provider idempotency key downstream
  • make consumers idempotent at state transition level
  • reconcile external side effects

Honest rule:

If losing Redis idempotency keys would cause unacceptable damage, Redis cannot be the only guard.


27. Multi-Region Idempotency

Multi-region active-active systems complicate idempotency.

If each region has local Redis:

same idempotency key can be claimed in two regions concurrently

Options:

Approach | Trade-off
route same key to same home region | simpler, adds routing dependency
globally replicated datastore | stronger, higher latency/complexity
downstream provider idempotency | useful for external side effects
DB global uniqueness | depends on DB architecture
accept duplicate and reconcile | only for low-risk workflows

For payment/order creation, prefer a single authority for the idempotency decision. Regional Redis alone is not enough for global uniqueness.


28. Client Contract

Clients must participate.

API documentation should say:

For POST /payments, clients must send Idempotency-Key.
The key must be unique per intended payment creation attempt.
Retries of the same attempt must reuse the same key.
Do not reuse a key with a different payload.
Keys are retained for 48 hours.
During processing, duplicate requests may return 202 with Retry-After.
After completion, duplicate requests return the original response with Idempotency-Replayed: true.

Without a clear client contract, the backend cannot infer intent reliably.

Bad client behavior:

  • generate new key on retry
  • reuse one key for all requests
  • change payload while reusing key
  • retry forever beyond retention window
  • hide timeout/error from user but keep retrying in background

Server should detect and reject the most dangerous cases.
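
On the client side, the contract comes down to generating the key once per intent and reusing it on every retry. A minimal sketch, in which `RetryingClient` and the `transport` function (returns an HTTP status, or -1 for a timeout) are hypothetical, and backoff is omitted for brevity:

```java
import java.util.UUID;
import java.util.function.Function;

final class RetryingClient {
    private RetryingClient() {}

    /**
     * Sends one logical command up to maxAttempts times.
     * The idempotency key is created once, outside the loop, so every retry reuses it.
     */
    static int sendWithRetries(Function<String, Integer> transport, int maxAttempts) {
        String idempotencyKey = "payreq_" + UUID.randomUUID();
        int status = -1;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            status = transport.apply(idempotencyKey);
            boolean retryable = status == -1 || status == 202 || status == 503;
            if (!retryable) {
                return status; // final answer, possibly a replayed original response
            }
        }
        return status; // retries exhausted; surface the last outcome to the caller
    }
}
```

Moving key generation inside the loop would reproduce the first bad behavior in the list above: each retry would look like a new intent to the server.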


29. Observability

Metrics:

idempotency_claim_total{operation,status}
idempotency_existing_total{operation,state}
idempotency_replay_total{operation}
idempotency_conflict_total{operation}
idempotency_in_progress_total{operation}
idempotency_recovery_total{operation,status}
idempotency_record_age_seconds{operation,state}
idempotency_finalize_total{operation,status}
idempotency_redis_error_total{operation,phase}
dedup_claim_total{consumer,status}
dedup_duplicate_total{consumer,eventType}

Logs should include:

operation
idempotencyKey hash, not raw if sensitive
tenant/account
requestHash
state
ownerToken hash
resourceId
replay flag
conflict reason

Tracing:

span: idempotency.claim
span attributes:
  idempotency.operation
  idempotency.status
  idempotency.state
  idempotency.replayed

Alert examples:

  • conflict rate spike
  • in-progress age exceeds threshold
  • finalize owner mismatch spike
  • Redis error during claim
  • dedup duplicate rate spike
  • unusual drop in replay rate after a Redis flush

30. Testing Idempotency

You need tests for duplicates, races, timeouts, and partial failure.

Core Test Cases

Test | Expected outcome
same request repeated after completion | same response replayed
same key different payload | 409 conflict
concurrent same key requests | one processes, others wait/202/replay
crash after claim before side effect | retry blocked until repair/TTL
crash after side effect before Redis completion | retry recovers from DB/provider
Redis unavailable during claim | policy-specific fail closed/open
finalization with wrong owner token | rejected
marker expires while worker still running | stale worker cannot finalize
webhook duplicate | only one processing side effect
event redelivery | idempotent consumer behavior

Concurrent Test Sketch

@Test
void onlyOneConcurrentRequestExecutesSideEffect() throws Exception {
    int concurrency = 20;
    ExecutorService executor = Executors.newFixedThreadPool(concurrency);
    CountDownLatch start = new CountDownLatch(1);
    AtomicInteger sideEffects = new AtomicInteger();

    try {
        // Every task blocks on the latch so all requests reach the service together.
        List<Future<ApiResponse>> futures = IntStream.range(0, concurrency)
            .mapToObj(i -> executor.submit(() -> {
                start.await();
                return paymentService.createPayment(
                    new CreatePaymentRequest("payreq_abc", 100_000),
                    () -> sideEffects.incrementAndGet()
                );
            }))
            .toList();

        start.countDown();

        List<ApiResponse> responses = new ArrayList<>();
        for (Future<ApiResponse> future : futures) {
            responses.add(future.get(5, TimeUnit.SECONDS));
        }

        // Exactly one side effect; duplicates see the replayed 201 or a 202 "still processing".
        assertThat(sideEffects.get()).isEqualTo(1);
        assertThat(responses).allMatch(r -> r.statusCode() == 201 || r.statusCode() == 202);
    } finally {
        executor.shutdownNow(); // do not leak pool threads between tests
    }
}

Failure Injection

Inject failures at these points:

before Redis claim
immediately after Redis claim
after DB insert before external call
after external call before DB insert
after DB commit before Redis completion
after Redis completion before HTTP response

If you cannot describe expected behavior for each point, the design is incomplete.


31. Security Considerations

Idempotency keys can become an attack surface.

Risks:

Risk | Mitigation
key guessing | use sufficiently random client keys
cross-tenant replay | include tenant/account in key scope
payload substitution | store request hash and reject mismatch
sensitive response in Redis | encrypt/minimize stored response
key flooding | rate limit and max key length
memory exhaustion | TTL, quotas, max payload size
raw key leakage in logs | hash or redact keys

Do not store full sensitive payloads in idempotency records unless necessary. Prefer storing:

statusCode
resourceId
minimal response fields
requestHash
state metadata

Then reconstruct full response from authoritative source when possible.


32. Production Design Patterns

Pattern A — Lightweight API Idempotency

Use for moderate-risk POST actions.

Redis SET NX in-progress
process command
store completed response in Redis
replay duplicates

Good for:

  • profile update commands
  • support ticket creation
  • notification send request
  • non-financial resource creation

Not enough for:

  • money movement
  • irreversible external side effects

Pattern B — Redis + DB Idempotency

Use for critical workflows.

Redis fast claim
DB idempotency table unique constraint
side effect with provider idempotency key
Redis completion record

Good for:

  • payment
  • order creation
  • subscription change
  • regulatory case transition

Pattern C — Event Consumer Dedup

Use for at-least-once messages.

Redis dedup quick check
DB transition idempotent by eventId/sourceVersion
ack only after safe side effect

Good for:

  • projections
  • notification processing
  • external webhook ingestion

Pattern D — Approximate Dedup

Use for analytics/noisy data.

Bloom/bitmap/time-bucketed structures
accept false positive risk
never use for critical commands

Good for:

  • clicks
  • impressions
  • telemetry
  • recommendation signal cleanup

33. Decision Matrix

Scenario | Redis pattern | Extra guard
client retries POST create order | idempotency state machine | DB unique table
payment authorization | Redis guard | provider idempotency + DB unique + reconciliation
webhook duplicate event | SET NX or state machine | DB processed event for critical providers
Kafka-like consumer redelivery | per-consumer dedup key | idempotent DB transition
temporary duplicate button click | SET NX short TTL | UX disable button
notification send once | SET NX per notification ID | provider message ID log
high-volume telemetry | approximate dedup | accept false positives
case lifecycle command | command idempotency key | expected version/state transition guard

34. Common Anti-Patterns

Anti-pattern 1 — “We Have Redis SET NX, So It Is Exactly Once”

No. SET NX only claims a key in Redis. It does not atomically cover the DB commit, the external side effect, the message ack, or the HTTP response.

Anti-pattern 2 — No Request Hash

Without a stored request hash, the same idempotency key sent with a different payload silently replays the old response. From the caller’s perspective, this is data corruption.
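One way to detect this is to store a fingerprint of the canonicalized request next to the key and compare it on every duplicate. The sketch below assumes a flat map of string fields, a naive `name=value` canonical form (it does not escape `=` or newlines inside values), and hypothetical names `RequestFingerprint` and `onDuplicate`.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical request fingerprint: sort fields by name, then hash the canonical text with SHA-256. */
final class RequestFingerprint {
    enum Decision { REPLAY, CONFLICT }

    /** Returns a 64-character hex SHA-256 of the fields in sorted order. */
    static String of(Map<String, String> fields) {
        StringBuilder canonical = new StringBuilder();
        // TreeMap sorts by field name, so map ordering does not change the hash.
        for (Map.Entry<String, String> field : new TreeMap<>(fields).entrySet()) {
            canonical.append(field.getKey()).append('=').append(field.getValue()).append('\n');
        }
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(canonical.toString().getBytes(StandardCharsets.UTF_8));
            return String.format("%064x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    /** Same key and same hash replays the result; same key and different hash is a conflict. */
    static Decision onDuplicate(String storedHash, String incomingHash) {
        return storedHash.equals(incomingHash) ? Decision.REPLAY : Decision.CONFLICT;
    }
}
```

A `CONFLICT` decision would typically surface to the caller as an explicit error rather than a replay.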

Anti-pattern 3 — TTL Too Short for Real Retry Window

If mobile clients retry for 24 hours but Redis TTL is 10 minutes, duplicates can pass after 10 minutes.

Anti-pattern 4 — Redis as Only Guard for Money Movement

If Redis loses keys or expires them too early, duplicate money movement may occur. Use durable uniqueness and provider idempotency.

Anti-pattern 5 — One Global Idempotency Key Namespace

Cross-tenant or cross-operation collision is possible. Always scope keys.
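A scoped key can be as simple as a prefixed, colon-delimited string. The `idem:` prefix, the component order, and the 128-character limit on the client key are assumptions chosen for this sketch, not a required format.

```java
/** Hypothetical builder for scoped idempotency keys of the form idem:{tenant}:{operation}:{clientKey}. */
final class IdempotencyKeys {
    private static final int MAX_CLIENT_KEY_LENGTH = 128; // assumed limit to bound key size

    static String scoped(String tenantId, String operation, String clientKey) {
        requireSafe("tenantId", tenantId);
        requireSafe("operation", operation);
        requireSafe("clientKey", clientKey);
        if (clientKey.length() > MAX_CLIENT_KEY_LENGTH) {
            throw new IllegalArgumentException("clientKey exceeds " + MAX_CLIENT_KEY_LENGTH + " characters");
        }
        return "idem:" + tenantId + ":" + operation + ":" + clientKey;
    }

    // Rejecting ':' keeps one component from spilling into the next scope.
    private static void requireSafe(String name, String value) {
        if (value == null || value.isEmpty() || value.contains(":")) {
            throw new IllegalArgumentException(name + " must be non-empty and must not contain ':'");
        }
    }
}
```

With this shape, the same client key used by two tenants, or for two operations, maps to different Redis keys.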

Anti-pattern 6 — Ignoring IN_PROGRESS

Returning a generic duplicate error for in-progress commands leaves clients stuck. Provide retry semantics or a recovery path.
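One way to make duplicate handling deterministic is to map each stored state to an explicit action. The state and action names below are hypothetical, and treating `FAILED` as reclaimable is an assumption that depends on whether the failed attempt could have left side effects behind.

```java
/** Hypothetical policy: what a duplicate request should get back for each stored state. */
final class DuplicatePolicy {
    enum State { IN_PROGRESS, COMPLETED, FAILED }
    enum Action { RETRY_LATER, REPLAY_RESULT, ALLOW_RECLAIM }

    static Action onDuplicate(State state) {
        switch (state) {
            case IN_PROGRESS:
                return Action.RETRY_LATER;   // tell the client to retry instead of failing hard
            case COMPLETED:
                return Action.REPLAY_RESULT; // return the stored response
            case FAILED:
                return Action.ALLOW_RECLAIM; // assumption: a failed attempt may be retried
            default:
                throw new IllegalArgumentException("unknown state: " + state);
        }
    }
}
```

In an HTTP API, `RETRY_LATER` might be expressed as a conflict response with a retry hint; the exact status code is a contract decision to document for clients.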

Anti-pattern 7 — Dedup Per Broker Offset Only

Offset dedup does not prevent the same business event from being published twice with different offsets. Use event ID when available.
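A dedup key builder can prefer the business event ID and fall back to the broker position only when no ID exists. The key layout and the parameter names here are hypothetical; the point is only that two publications of the same event resolve to the same key.

```java
/** Hypothetical dedup key builder: per-consumer scope, event ID preferred over broker offset. */
final class EventDedupKeys {
    static String forEvent(String consumerGroup, String eventId, String topic, int partition, long offset) {
        if (eventId != null && !eventId.isEmpty()) {
            // Stable across republication: the same business event always maps to one key.
            return "dedup:" + consumerGroup + ":event:" + eventId;
        }
        // Fallback only: protects against redelivery of one record, not against republication.
        return "dedup:" + consumerGroup + ":offset:" + topic + ":" + partition + ":" + offset;
    }
}
```

With an event ID present, the same event published at offsets 10 and 57 yields one key, so the second delivery is recognized as a duplicate.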


35. Engineering Checklist

Before shipping Redis-backed idempotency:

  • Operation requiring idempotency is explicit.
  • Client contract is documented.
  • Idempotency key scope includes tenant/account/operation.
  • Request fingerprint is canonicalized and stored.
  • Same key different payload returns conflict.
  • State machine includes in-progress and completed states.
  • In-progress TTL is based on real processing time.
  • Completion TTL covers retry horizon.
  • Duplicate completed request replays result.
  • Duplicate in-progress request has deterministic behavior.
  • Owner token prevents stale finalization.
  • Redis failure policy is defined.
  • Source/DB recovery path exists for critical workflows.
  • DB/provider uniqueness exists where side effects are irreversible.
  • Webhook/event dedup is per consumer where needed.
  • Metrics cover claim, replay, conflict, in-progress, recovery.
  • Failure injection tests cover partial commits and timeouts.
  • Sensitive data is not over-stored in Redis.
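The "owner token prevents stale finalization" item is the easiest one to get wrong, so here is an in-memory sketch of the idea. `ConcurrentMap.replace(key, expected, updated)` stands in for an atomic compare-and-set that a real implementation would perform in Redis (for example in a Lua script); the class name, value prefixes, and `expire` helper are assumptions.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** In-memory model of owner-token claims: only the current owner may finalize a key. */
final class OwnerTokenClaims {
    // Stand-in for Redis: key -> "IN_PROGRESS:<ownerToken>" or "COMPLETED:<result>".
    private final ConcurrentMap<String, String> store = new ConcurrentHashMap<>();

    /** Claims the key for this owner; returns false if another claim or result exists. */
    boolean claim(String key, String ownerToken) {
        return store.putIfAbsent(key, "IN_PROGRESS:" + ownerToken) == null;
    }

    /** Models the in-progress TTL expiring while the original worker is still running. */
    void expire(String key) {
        store.remove(key);
    }

    /** Finalizes only if the stored claim still belongs to this owner (atomic compare-and-set). */
    boolean complete(String key, String ownerToken, String result) {
        return store.replace(key, "IN_PROGRESS:" + ownerToken, "COMPLETED:" + result);
    }

    String value(String key) {
        return store.get(key);
    }
}
```

If worker A's claim expires and worker B reclaims the key, A's late `complete` call returns false and cannot overwrite B's state.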

36. Mental Model: The Idempotency Boundary

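For every workflow, draw the boundaries where a duplicate can enter: the API handler, the database, the external provider, and the event consumer.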

Each boundary has different guarantees. Redis usually protects one boundary. It does not automatically protect all boundaries.

For example:

API Redis idempotency prevents two API handlers from starting the same command.
DB uniqueness prevents duplicate durable command records.
Provider idempotency prevents duplicate external charge.
Consumer dedup prevents duplicate projection side effects.

A robust system layers idempotency boundaries.


37. Part Summary

Redis is excellent for fast, atomic, TTL-bound duplicate suppression. But idempotency is a workflow design problem, not only a Redis command problem.

Key takeaways:

  • Idempotency is about same intent producing same effect/result under retry.
  • Deduplication is only one mechanism inside idempotency.
  • Use client-provided idempotency keys for retryable POST commands.
  • Scope keys by tenant/account/operation.
  • Store canonical request hash and reject mismatches.
  • Use an explicit state machine: in-progress, completed, failed, expired.
  • Use SET NX PX or Lua for atomic claim.
  • Use owner tokens to prevent stale finalization.
  • Replay completed responses for duplicate successful requests.
  • TTL defines the dedup correctness window, not just memory cleanup.
  • Redis alone is not enough for irreversible side effects like payment.
  • Combine Redis with DB uniqueness, provider idempotency, and reconciliation where needed.
  • Event/webhook dedup should use event identity and per-consumer scope.
  • Approximate dedup is useful only when false positives are acceptable.
  • Test concurrency, crash windows, Redis loss, and replay behavior.

Next part:

Part 017 — Rate Limiting and Quota Enforcement

