Authentication Performance Engineering

Learn Java Authentication Pattern - Part 037

Performance engineering for Java authentication systems: latency budgets, capacity models, password hash cost, token validation, introspection, session stores, Redis, JWKS caches, rate limiters, thread pools, backpressure, SLOs, benchmarks, load testing, and production failure modes.

[2026-07-03] 10 min read, 1814 words

Part 037 — Authentication Performance Engineering

Goal of this part: understand authentication as a performance-critical system. We will distinguish auth paths that are meant to be expensive, such as password hashing, from auth paths that must be very cheap, such as per-request token validation. The goal is not to make auth “as fast as possible”, but to make it secure, stable, measurable, and never the point where the whole platform collapses.

Authentication has a unique performance profile.

On one hand, some operations are deliberately or inherently expensive:

password verification
TOTP verification with abuse control
refresh token rotation with reuse detection
risk scoring
remote token introspection
federated login callback

On the other hand, some operations run on almost every request and must be cheap:

session lookup
JWT signature verification
SecurityContext reconstruction
tenant resolution
authorization pre-check dependency
audit event enqueue

A common engineering mistake is to lump all of this together as “auth overhead”. That is too coarse.

Authentication performance engineering has to answer more precise questions:

Which auth path is hot?
Which auth path is intentionally expensive?
Which path can be cached?
Which path must never be cached?
Which dependency is allowed to fail closed?
Which dependency must degrade gracefully?
Which latency spike can cause login storm?
Which CPU spike can become denial-of-service?

We are not optimizing a login page. We are protecting the identity control plane.

1. Mental model: authentication has two performance classes

Split authentication paths into two broad classes.

Class A — request-path authentication
Runs on many or every API request.
Must be cheap, deterministic, and cache-aware.
Examples: JWT validation, session lookup, API key lookup, mTLS principal extraction.

Class B — ceremony authentication
Runs during explicit auth ceremonies.
May be expensive, stateful, and abuse-controlled.
Examples: password login, OAuth callback, MFA challenge, refresh rotation, recovery.

Invariant:

Request-path auth should not perform unpredictable expensive work.
Ceremony auth must be protected from concurrency, abuse, and resource exhaustion.

If every API request performs remote introspection, a large database join, full risk scoring, or password hash verification, the system will collapse before any business code runs.

If the login ceremony is not throttled, an attacker can use password hashing as a CPU exhaustion primitive.

2. Performance objective: security first, then predictable latency

The performance goal for auth is not just low latency.

The real goals:

1. Security invariants stay correct.
2. Auth paths have bounded resource usage.
3. The latency distribution is predictable.
4. A dependency failure does not bring down the whole platform.
5. Incident response is possible without a full outage.

Example trade-offs:

Area	Faster choice	Safer / more controllable choice
Password hashing	Low work factor	Tuned adaptive cost + bounded executor
Token validation	Trust unsigned/weak token	Verify signature, issuer, audience, expiry
Opaque token	Cache forever	Bounded TTL + revocation semantics
Session store	Local memory only	Redis/DB with explicit failover model
JWKS	Fetch on every request	Cache with refresh + key rotation policy
Rate limiter	No limiter	Multi-dimensional limiter with graceful errors

Auth performance engineering always lives in this tension:

Too cheap => weak against attackers.
Too expensive => weak against denial-of-service.

A top engineer does not ask “how fast is login?”. They ask “what is the worst-case cost an attacker can force per request?”.

3. Latency budget model

Start from a budget.

Example internal target:

GET /api/orders/{id}
P95 total latency: 120 ms
P99 total latency: 250 ms

Auth budget inside request:
P95: <= 5 ms
P99: <= 15 ms

Example for login:

POST /login
P95 total latency: 500 ms
P99 total latency: 1500 ms

Password verification budget:
100 ms - 500 ms, depending on policy and hardware

Why is the login budget allowed to be more expensive?

Because password hashing is a deliberate defensive cost. But precisely because it is expensive, it needs guardrails:

rate limit
bounded executor
queue limit
per-account throttling
per-IP/client throttling
synthetic verification budget
circuit breaker for downstream dependencies

A simple model:

request_latency = auth_latency + authorization_latency + business_latency + persistence_latency + serialization_latency

For request-path auth:

auth_latency = parse + local_validation + context_mapping + optional_cache_lookup

For the login ceremony:

login_latency = limiter + account_lookup + hash_verify + state_transition + session_issue + audit_enqueue

Do not measure login by its average. Measure the distribution.

P50 tells you normal user experience.
P95 tells you capacity pressure.
P99 tells you tail risk.
Max tells you incident symptom.
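
To make "measure the distribution" concrete, here is a minimal sketch that computes nearest-rank percentiles from a batch of latency samples. The class name `LatencyPercentiles` and the sample values are illustrative assumptions; a production system would normally rely on histogram-based metrics rather than sorting raw samples.

```java
import java.util.Arrays;

// Hypothetical helper: nearest-rank percentiles over a batch of latency samples.
final class LatencyPercentiles {
    private LatencyPercentiles() {}

    /** Returns the nearest-rank percentile of the samples, for 0 < p <= 100. */
    static long percentile(long[] samplesMs, double p) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Arithmetic mean, shown only to contrast with the percentiles. */
    static double mean(long[] samplesMs) {
        return Arrays.stream(samplesMs).average().orElse(Double.NaN);
    }
}
```

With ten samples where nine fall between 210 ms and 290 ms and one takes 3000 ms, the mean is 525 ms while P50 is 250 ms and P95 is 3000 ms. The average describes neither the typical user nor the tail.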

4. Capacity model: authentication as a bottleneck

Auth capacity is often not the same as ordinary API capacity.

For example:

Password hash cost: 250 ms CPU-equivalent
Available login worker threads: 32
Theoretical max login verifications: 32 / 0.25 = 128 verifications/sec

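That is before database, Redis, audit, network, GC, and contention overhead. It also assumes every worker thread has a CPU core to itself; with fewer cores than workers, the real ceiling is lower.

So the safe capacity might be: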

safe_capacity = theoretical_capacity * safety_factor
safe_capacity = 128 * 0.5 = 64 login/sec
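
The arithmetic above can be captured in a small calculator so the numbers are re-derived whenever the hash cost or worker count changes. This is a sketch only: `LoginCapacityModel` and its method names are hypothetical, and the model still assumes one CPU core per worker.

```java
// Hypothetical calculator for the capacity model above.
final class LoginCapacityModel {
    private LoginCapacityModel() {}

    /** Upper bound on verifications/sec if every worker hashes back to back. */
    static double theoreticalPerSecond(int workers, double hashCostSeconds) {
        if (workers <= 0 || hashCostSeconds <= 0) {
            throw new IllegalArgumentException("workers and hash cost must be positive");
        }
        return workers / hashCostSeconds;
    }

    /** Theoretical capacity scaled down by a safety factor in (0, 1]. */
    static double safePerSecond(int workers, double hashCostSeconds, double safetyFactor) {
        if (safetyFactor <= 0 || safetyFactor > 1) {
            throw new IllegalArgumentException("safety factor must be in (0, 1]");
        }
        return theoreticalPerSecond(workers, hashCostSeconds) * safetyFactor;
    }
}
```

For 32 workers, a 0.25 s hash cost, and a 0.5 safety factor, this returns 128 and 64 verifications/sec, matching the figures above.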

If a bot sends 1000 logins/sec and there is no limiter, requests queue up, latency climbs, threads run out, and other endpoints are affected too.

That is why password verification must be bulkheaded.

Invariant:

An attacker must not be able to consume unbounded CPU by forcing password verification.

5. Password hashing performance

Password hashing is the most commonly misunderstood part.

A password hash must be expensive enough to slow offline cracking if the credential database leaks. But the verifier must also be able to serve legitimate users without becoming a DoS amplifier.

Spring Security cites bcrypt, PBKDF2, scrypt, and argon2 as examples of adaptive one-way functions. It also stresses that the work factor must be tuned for your own system, because performance depends heavily on hardware.

The OWASP Password Storage Cheat Sheet recommends storing passwords with a slow hashing algorithm such as Argon2id, bcrypt, or PBKDF2, not with a fast hash such as plain SHA-256.

Mental model:

password_hash_cost is a security control
password_hash_executor is a capacity control
rate_limiter is an abuse control
rehash_policy is a migration control

5.1 Do not run password hashing carelessly on the request thread pool

If every request shares the same thread pool:

/login password verification consumes all web threads
/api/health fails
/api/orders fails
metrics scrape fails
load balancer marks instance unhealthy
traffic shifts to other pods
other pods collapse

Use a dedicated bounded executor.

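import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

// Dedicated, bounded pool for password hashing so login load cannot starve web threads.
public final class PasswordHashExecutor {
    private final ThreadPoolExecutor executor;

    public PasswordHashExecutor(int workers, int queueSize) {
        AtomicInteger threadId = new AtomicInteger();
        this.executor = new ThreadPoolExecutor(
            workers,                               // fixed size: core == max
            workers,
            0L,
            TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(queueSize),   // bounded queue: excess work is rejected, not buffered
            r -> {
                Thread t = new Thread(r);
                t.setName("password-hash-worker-" + threadId.incrementAndGet());
                t.setDaemon(true);
                return t;
            },
            new ThreadPoolExecutor.AbortPolicy()   // throws RejectedExecutionException when full
        );
    }

    public <T> CompletableFuture<T> submit(Callable<T> task) {
        CompletableFuture<T> result = new CompletableFuture<>();
        try {
            executor.execute(() -> {
                try {
                    result.complete(task.call());
                } catch (Throwable ex) {
                    result.completeExceptionally(ex);
                }
            });
        } catch (RejectedExecutionException ex) {
            // Surface saturation as a capacity error, never as an account-specific failure.
            result.completeExceptionally(new AuthCapacityExceededException());
        }
        return result;
    }

    // Stop accepting new work, for example on application shutdown.
    public void shutdown() {
        executor.shutdown();
    }

    // Defined here so the example compiles on its own.
    public static final class AuthCapacityExceededException extends RuntimeException {
        public AuthCapacityExceededException() {
            super("password hash executor is at capacity");
        }
    }
}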

Policy:

Queue full must not produce a distinct account-specific error.
Capacity failure must be logged as system event.
User response should stay generic.

5.2 Tune cost with benchmark, not folklore

Bad policy:

bcrypt 12 because blog said so
argon2 defaults because library said so
PBKDF2 iterations copied from old project

Better policy:

Benchmark on production-like hardware.
Choose target verification latency range.
Reserve CPU for peak login and attack conditions.
Document chosen parameters.
Re-evaluate during hardware/runtime upgrade.

Example benchmark dimensions:

algorithm
parameters
JDK version
container CPU limit
node CPU type
memory limit
concurrency
P50/P95/P99 verify latency
CPU saturation point
GC impact

Parameter decision record:

## Password Hash Parameter ADR

Algorithm: Argon2id
Memory cost: ...
Iterations: ...
Parallelism: ...
Target P95 verification latency: ...
Measured hardware: ...
Date measured: ...
Maximum login verification concurrency per pod: ...
Reason: ...
Migration path: rehash-on-login

5.3 Synthetic verification must also be budgeted

To prevent account enumeration, systems often run a synthetic hash verification when the account is not found.

Account exists: verify real password hash
Account missing: verify against synthetic hash

This is correct for security, but expensive for performance.

Without a limiter, an attacker can send random usernames and force hash verification over and over.

Invariant:

Enumeration defense must not become CPU exhaustion primitive.

Solutions:

rate limit before expensive verification
bounded hash executor
synthetic hash cached in memory
same general response semantics
audit high-cardinality identifier spray
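
A minimal sketch of the "same work for existing and missing accounts" idea follows. It uses the JDK's PBKDF2 only to stay self-contained; the class name, the in-memory account map, and the iteration count are illustrative assumptions, not tuned values. The rate limiter and bounded executor that must sit in front of this code are deliberately left out.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical sketch: a missing account costs one verification, same as a real one.
final class SyntheticVerificationSketch {
    private static final int ITERATIONS = 10_000; // illustrative only; tune by benchmark
    private static final int KEY_BITS = 256;

    static final class StoredCredential {
        final byte[] salt;
        final byte[] hash;
        StoredCredential(byte[] salt, byte[] hash) { this.salt = salt; this.hash = hash; }
    }

    private final Map<String, StoredCredential> accounts = new ConcurrentHashMap<>();
    // Random salt and random "hash", created once and kept in memory.
    private final StoredCredential synthetic = new StoredCredential(random(16), random(32));

    void register(String username, char[] password) {
        byte[] salt = random(16);
        accounts.put(username, new StoredCredential(salt, pbkdf2(password, salt)));
    }

    /** Performs one PBKDF2 verification whether or not the account exists. */
    boolean login(String username, char[] password) {
        StoredCredential found = accounts.get(username);
        StoredCredential target = (found != null) ? found : synthetic;
        boolean matches = MessageDigest.isEqual(pbkdf2(password, target.salt), target.hash);
        return found != null && matches;
    }

    private static byte[] random(int length) {
        byte[] bytes = new byte[length];
        new SecureRandom().nextBytes(bytes);
        return bytes;
    }

    private static byte[] pbkdf2(char[] password, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException("PBKDF2 unavailable", ex);
        }
    }
}
```

Because the synthetic credential is built once, a spray of unknown usernames costs exactly one verification each. That is why the limiter has to run before `login` is ever reached.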

5.4 Rehash-on-login must not become a latency surprise

When the cost factor is raised, the system can rehash after a successful login.

if (passwordEncoder.matches(rawPassword, storedHash)) {
    if (passwordEncoder.upgradeEncoding(storedHash)) {
        String newHash = passwordEncoder.encode(rawPassword);
        credentialRepository.updateHash(accountId, credentialVersion, newHash);
    }
}

The problem:

verify old hash + encode new hash = double expensive operation

For high-traffic systems, rehash needs to be designed:

only after successful login
bounded executor
optimistic credential version update
optional async rehash with secure raw password handling constraints
progress metrics
rollback path

Avoid putting the raw password on a queue or message broker for async rehash. That widens the blast radius.

6. JWT validation performance

JWT is often assumed to be “stateless and fast”. That is only true if the implementation is right.

Hot path JWT validation:

1. Read Authorization header.
2. Parse compact token.
3. Enforce allowed algorithm.
4. Resolve key by kid.
5. Verify signature.
6. Validate issuer.
7. Validate audience.
8. Validate expiry / not-before / clock skew.
9. Map claims to principal.
10. Continue request.
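
Steps 6 to 8 are cheap, local checks. The sketch below shows them in isolation, assuming the signature has already been verified by a JOSE library and the claims have been extracted. `JwtClaimChecks` and its `Outcome` enum are hypothetical names, not part of any library.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Set;

// Hypothetical sketch of the issuer / audience / time checks on already-verified claims.
final class JwtClaimChecks {
    enum Outcome { VALID, INVALID_ISSUER, INVALID_AUDIENCE, EXPIRED, NOT_YET_VALID }

    private final String expectedIssuer;
    private final String expectedAudience;
    private final Duration clockSkew;

    JwtClaimChecks(String expectedIssuer, String expectedAudience, Duration clockSkew) {
        this.expectedIssuer = expectedIssuer;
        this.expectedAudience = expectedAudience;
        this.clockSkew = clockSkew;
    }

    /** Pure in-memory checks: no I/O, no allocation-heavy work on the hot path. */
    Outcome check(String iss, Set<String> aud, Instant exp, Instant nbf, Instant now) {
        if (!expectedIssuer.equals(iss)) {
            return Outcome.INVALID_ISSUER;
        }
        if (aud == null || !aud.contains(expectedAudience)) {
            return Outcome.INVALID_AUDIENCE;
        }
        // A missing exp is treated as expired: fail closed.
        if (exp == null || !now.isBefore(exp.plus(clockSkew))) {
            return Outcome.EXPIRED;
        }
        if (nbf != null && now.isBefore(nbf.minus(clockSkew))) {
            return Outcome.NOT_YET_VALID;
        }
        return Outcome.VALID;
    }
}
```

Returning a distinct outcome per failure keeps the metrics split by reason without changing the generic response sent to the client.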

Performance risk:

fetch JWKS on every request
parse enormous token
map huge claim graph
query DB for roles on every request
validate token against wrong tenant then retry all tenants
log token payload

6.1 JWKS cache strategy

JWKS fetch must not be on request critical path after warm-up.

Good strategy:

cache keys by issuer + kid
respect cache headers if safe
background refresh
refresh on unknown kid with rate limit
pin allowed issuer metadata
fail closed for invalid signature
avoid retry storm when IdP/JWKS endpoint is down

Bad strategy:

if kid unknown, fetch JWKS for every request
if fetch fails, accept token temporarily
if issuer unknown, try every configured issuer

An unknown kid is an attack vector for a cache-miss storm.

unknown_kid_rate_limiter[issuer, kid_hash, source] => allow limited refresh attempts
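
The refresh limiter can be sketched as a per-issuer key cache that refreshes at most once per interval when a kid is unknown. This is a simplification of the keyed limiter above: it uses a single interval per cache instead of per kid hash and source. The `Supplier` standing in for the JWKS fetch and the `String` key material are assumptions made for brevity.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch: one instance per issuer; fetcher returns kid -> key material.
final class JwksCacheSketch {
    private final Map<String, String> keysByKid = new ConcurrentHashMap<>();
    private final Supplier<Map<String, String>> fetcher;
    private final long minRefreshIntervalMs;
    private long lastRefreshAtMs;
    private boolean refreshedOnce;

    JwksCacheSketch(Supplier<Map<String, String>> fetcher, long minRefreshIntervalMs) {
        this.fetcher = fetcher;
        this.minRefreshIntervalMs = minRefreshIntervalMs;
    }

    /** Returns the key for kid; an empty result means the caller must fail closed. */
    Optional<String> resolve(String kid, long nowMs) {
        if (kid == null) {
            return Optional.empty();
        }
        String cached = keysByKid.get(kid);
        if (cached != null) {
            return Optional.of(cached); // hot path: no lock, no I/O
        }
        refreshIfAllowed(nowMs);
        return Optional.ofNullable(keysByKid.get(kid));
    }

    private synchronized void refreshIfAllowed(long nowMs) {
        if (refreshedOnce && nowMs - lastRefreshAtMs < minRefreshIntervalMs) {
            return; // limiter: random kids cannot trigger a fetch per request
        }
        refreshedOnce = true;
        lastRefreshAtMs = nowMs;
        try {
            keysByKid.putAll(fetcher.get());
        } catch (RuntimeException ex) {
            // Keep the keys already cached; a failed fetch never means "accept".
        }
    }
}
```

The interval is recorded before the fetch, so a failing JWKS endpoint is also retried at most once per interval rather than on every request.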

6.2 Token size matters

JWT is sent on every request.

Large tokens increase:

network overhead
header parsing cost
proxy/header limit risk
log leak risk
claim mapping cost
browser storage/cookie size risk

Keep access token claims minimal:

iss
sub
aud
exp
nbf
iat
jti if needed
tenant/member context if needed
coarse scopes/authorities

Do not pack entire authorization graph into JWT.

Bad: all permissions for every object
Better: coarse grant + resource-server-side authorization lookup/cache

6.3 Local JWT validation vs opaque introspection

Aspect	JWT local validation	Opaque token introspection
Hot-path latency	Low after key cache warm	Network-dependent
Revocation immediacy	Harder	Easier
Token privacy	Claims visible to holder	Claims hidden
Dependency on IdP per request	No	Yes unless cached
Operational failure mode	Key cache / rotation	Introspection outage

Decision rule:

Use local JWT validation when high request throughput and bounded token lifetime matter.
Use opaque token/introspection when immediate revocation and central control matter more.
Use caching carefully; otherwise introspection becomes IdP DDoS by design.

7. Opaque token introspection performance

Opaque tokens push validation to authorization server/introspection endpoint.

Risks:

introspection endpoint latency dominates API latency
IdP outage becomes API outage
attacker sends random tokens causing introspection storm
token cache stores raw token
revocation semantics broken by long cache TTL

Good practice:

hash token before cache key
do negative caching for random invalid tokens with very short TTL
use bounded connection pool to introspection endpoint
use timeout shorter than API timeout budget
cache active result no longer than acceptable revocation lag
track active/inactive/introspection_error separately

Example cache key:

sha256("opaque-token-cache:v1:" + token)

Never log raw opaque token.
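
The practices above can be combined in one small cache: hashed keys, a short TTL for active results, and a shorter TTL for inactive ones. The sketch is an assumption-laden illustration. The `Predicate` stands in for the remote introspection call, the map is unbounded (a real cache needs a size limit), and time is passed in explicitly to keep the logic testable.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical sketch of an introspection result cache keyed by token hash.
final class IntrospectionCacheSketch {
    private static final class Entry {
        final boolean active;
        final long expiresAtMs;
        Entry(boolean active, long expiresAtMs) { this.active = active; this.expiresAtMs = expiresAtMs; }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final Predicate<String> introspect; // stand-in for the remote call
    private final long activeTtlMs;             // equals the accepted revocation lag
    private final long inactiveTtlMs;           // very short negative cache

    IntrospectionCacheSketch(Predicate<String> introspect, long activeTtlMs, long inactiveTtlMs) {
        this.introspect = introspect;
        this.activeTtlMs = activeTtlMs;
        this.inactiveTtlMs = inactiveTtlMs;
    }

    boolean isActive(String token, long nowMs) {
        String key = cacheKey(token);
        Entry hit = cache.get(key);
        if (hit != null && nowMs < hit.expiresAtMs) {
            return hit.active;
        }
        boolean active = introspect.test(token); // exceptions propagate and are not cached
        cache.put(key, new Entry(active, nowMs + (active ? activeTtlMs : inactiveTtlMs)));
        return active;
    }

    /** The raw token never becomes a cache key. */
    static String cacheKey(String token) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(
                    ("opaque-token-cache:v1:" + token).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException("SHA-256 unavailable", ex);
        }
    }
}
```

Introspection errors are left uncached on purpose, so a brief IdP outage is not remembered as "inactive" or "active" for the full TTL.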

8. Session store performance

Session auth shifts hot-path cost to session lookup.

Possible session store:

container memory
sticky session
Redis
relational database
hybrid cache + central store

Hot path:

cookie -> session id -> store lookup -> session record -> principal reconstruction

Risk:

Redis latency spike impacts every authenticated request
session object too large
session serialization is slow
session index updates become hot keys
logout requires slow global scan
concurrent session control does per-request write

8.1 Session object must be small

Bad session:

full User entity
roles from every tenant
preferences
cart
large serialized object graph
last activity update on every request

Good session:

sessionId
accountId
subjectId
tenantId/current tenant context
credentialVersion/authVersion
assuranceLevel
issuedAt
lastSeenAt maybe throttled
expiresAt
selected authorities snapshot or version pointer

If authorization data is large, store version pointer and cache separately.

8.2 Avoid write amplification

A common hidden cost:

Every request updates session lastAccessedTime.

In high traffic, this becomes write amplification to Redis/database.

Mitigations:

throttle last-seen updates
use sliding expiration carefully
separate idle timeout from audit heartbeat
batch/update asynchronously only when safe
avoid per-request large session serialization
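
Throttling last-seen updates can be as simple as remembering when each session was last written and skipping writes inside a minimum interval. A minimal sketch, assuming an in-process map and a hypothetical `LastSeenThrottle` class; a multi-node deployment would see one write per node per interval.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: decide whether a request should persist lastSeenAt.
final class LastSeenThrottle {
    private final ConcurrentMap<String, Long> lastWriteMs = new ConcurrentHashMap<>();
    private final long minIntervalMs;

    LastSeenThrottle(long minIntervalMs) {
        this.minIntervalMs = minIntervalMs;
    }

    /** Returns true when the caller should write lastSeenAt to the session store. */
    boolean shouldWrite(String sessionId, long nowMs) {
        boolean[] write = {false};
        // compute() is atomic per key, so concurrent requests produce a single write.
        lastWriteMs.compute(sessionId, (id, previous) -> {
            if (previous == null || nowMs - previous >= minIntervalMs) {
                write[0] = true;
                return nowMs;
            }
            return previous;
        });
        return write[0];
    }

    /** Call on logout or expiry so the map does not grow without bound. */
    void forget(String sessionId) {
        lastWriteMs.remove(sessionId);
    }
}
```

With a 60-second interval, a session that sends 50 requests per second produces one store write per minute instead of 3000.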

8.3 Session indexing

Operations that need index:

revoke all sessions for account
revoke all sessions for tenant
list active sessions for user
concurrent session control
compromise response

Index design:

session:{sessionId} -> session record
account_sessions:{accountId} -> set of sessionIds
tenant_sessions:{tenantId} -> set of sessionIds, if operationally required

Failure mode:

session exists but index missing
index points to expired session
large tenant-wide session set causes slow revoke

Cleanup must be designed, not assumed.

9. API key lookup performance

API key verification hot path:

parse key prefix
lookup credential metadata by prefix
hash presented secret
constant-time compare
scope/client/tenant validation
rate limit/quota check

Avoid full-table secret scanning.

Good API key format:

ak_live_<public_prefix>_<secret_random>

Store:

id
public_prefix unique indexed
secret_hash
client_id
tenant_id
status
scope
created_at
rotated_at
last_used_at throttled

Performance invariant:

API key lookup must be O(1) or indexed O(log n), never scan-based.
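
A sketch of the indexed lookup, using a map as a stand-in for the unique index on `public_prefix`. It assumes the prefix contains no underscore and that the secret is high-entropy random data, which is why a single SHA-256 is used here instead of a slow password hash. `ApiKeyVerifierSketch` and its nested `Credential` type are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: prefix lookup plus constant-time secret comparison.
final class ApiKeyVerifierSketch {
    static final class Credential {
        final String clientId;
        final byte[] secretHash;
        Credential(String clientId, byte[] secretHash) {
            this.clientId = clientId;
            this.secretHash = secretHash;
        }
    }

    // Stands in for a table with a unique index on public_prefix.
    private final Map<String, Credential> byPrefix = new ConcurrentHashMap<>();

    void register(String publicPrefix, String secret, String clientId) {
        byPrefix.put(publicPrefix, new Credential(clientId, sha256(secret)));
    }

    /** Expected shape: ak_live_<public_prefix>_<secret_random>. */
    Optional<String> verify(String presentedKey) {
        if (presentedKey == null) {
            return Optional.empty();
        }
        String[] parts = presentedKey.split("_", 4);
        if (parts.length != 4 || !"ak".equals(parts[0]) || !"live".equals(parts[1])) {
            return Optional.empty();
        }
        Credential credential = byPrefix.get(parts[2]); // one indexed lookup, no scan
        if (credential == null) {
            return Optional.empty();
        }
        boolean matches = MessageDigest.isEqual(sha256(parts[3]), credential.secretHash);
        return matches ? Optional.of(credential.clientId) : Optional.empty();
    }

    private static byte[] sha256(String value) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(value.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException("SHA-256 unavailable", ex);
        }
    }
}
```

Status, scope, tenant, and quota checks would follow the secret comparison; they are omitted to keep the lookup path visible.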

Last-used tracking can create write pressure.

Better:

update last_used_at at most once per N minutes
emit audit event asynchronously
keep rate limiter counters in Redis or purpose-built store

10. HMAC signing performance

HMAC verification is usually cheaper than password hashing, but it can become expensive if canonicalization is inefficient.

Hot path:

parse key id
lookup shared secret metadata
canonicalize request
compute payload hash or streaming digest
compute HMAC
constant-time compare
check timestamp/nonce
continue

Performance risks:

read huge body into memory
canonicalization allocates heavily
nonce store is unbounded
clock skew window too large
signature verification happens before body size limit

Good ordering:

1. Apply max request size.
2. Parse required signature headers.
3. Validate timestamp window.
4. Lookup key by key id.
5. Stream payload hash if needed.
6. Compute canonical signature.
7. Constant-time compare.
8. Store nonce with TTL.
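
The ordering above can be sketched with the JDK's `HmacSHA256`. Everything specific here is an assumption for illustration: the canonical form (timestamp, nonce, then body), the single shared secret instead of a key-id lookup, and the in-memory nonce set, which has no TTL eviction and would need one in practice.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: cheap rejections first, nonce stored last.
final class HmacRequestVerifierSketch {
    private final byte[] sharedSecret;
    private final long maxSkewMs;
    private final int maxBodyBytes;
    private final Set<String> seenNonces = ConcurrentHashMap.newKeySet(); // no TTL in this sketch

    HmacRequestVerifierSketch(byte[] sharedSecret, long maxSkewMs, int maxBodyBytes) {
        this.sharedSecret = sharedSecret.clone();
        this.maxSkewMs = maxSkewMs;
        this.maxBodyBytes = maxBodyBytes;
    }

    boolean verify(long timestampMs, String nonce, byte[] body, byte[] signature, long nowMs) {
        if (body.length > maxBodyBytes) {
            return false; // size limit before any hashing
        }
        if (Math.abs(nowMs - timestampMs) > maxSkewMs) {
            return false; // timestamp window before any hashing
        }
        byte[] expected = sign(timestampMs, nonce, body);
        if (!MessageDigest.isEqual(expected, signature)) {
            return false; // constant-time compare
        }
        // Only authenticated requests consume nonce storage; a repeat is a replay.
        return seenNonces.add(nonce);
    }

    byte[] sign(long timestampMs, String nonce, byte[] body) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(sharedSecret, "HmacSHA256"));
            mac.update((timestampMs + "\n" + nonce + "\n").getBytes(StandardCharsets.UTF_8));
            return mac.doFinal(body);
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException("HmacSHA256 unavailable", ex);
        }
    }
}
```

Storing the nonce only after the signature matches keeps unauthenticated traffic from filling the nonce store.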

Nonce store capacity:

nonce_entries = request_rate_per_client * replay_window_seconds

If a client sends 100 req/s and the replay window is 300 seconds:

nonce_entries = 30,000 per client

For 10,000 such clients that is 300 million entries, so a naive nonce store becomes huge.

Possible mitigations:

shorter replay window
per-client quota
nonce compaction
Bloom filter with false-positive trade-off where acceptable
idempotency key alignment

11. OAuth/OIDC callback performance

OAuth/OIDC login callback is not normally the hottest path, but it is dependency-heavy.

Operations:

state lookup
authorization code exchange
ID token validation
JWKS resolution
UserInfo call optional
account linking/provisioning
session creation
login audit

Risks:

IdP token endpoint slow
JWKS cold cache
UserInfo endpoint slow
account provisioning locks user table
email-domain tenant lookup slow
retry causes duplicate account linking

Design:

cache OIDC provider metadata/JWKS
avoid UserInfo call if ID token has required claims
make account linking idempotent
use unique constraint on issuer + subject
put provisioning behind explicit transaction boundary
separate interactive latency from async enrichment

Account linking invariant:

External identity key = issuer + subject
Never trust email alone as identity key.
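
The invariant can be sketched as an idempotent "link or provision" keyed only by issuer and subject. The in-memory map stands in for a table with a unique constraint on those two columns; `AccountLinkerSketch` and the UUID account ids are illustrative assumptions.

```java
import java.util.Map;
import java.util.Objects;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: repeated callbacks for one external identity yield one account.
final class AccountLinkerSketch {
    static final class ExternalIdentity {
        final String issuer;
        final String subject;

        ExternalIdentity(String issuer, String subject) {
            this.issuer = Objects.requireNonNull(issuer);
            this.subject = Objects.requireNonNull(subject);
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof ExternalIdentity)) {
                return false;
            }
            ExternalIdentity that = (ExternalIdentity) other;
            return issuer.equals(that.issuer) && subject.equals(that.subject);
        }

        @Override
        public int hashCode() {
            return Objects.hash(issuer, subject);
        }
    }

    // Stands in for a unique constraint on (issuer, subject).
    private final Map<ExternalIdentity, String> accountIdByIdentity = new ConcurrentHashMap<>();

    /** The email is profile data only; it is never part of the lookup key. */
    String linkOrProvision(String issuer, String subject, String email) {
        return accountIdByIdentity.computeIfAbsent(
                new ExternalIdentity(issuer, subject),
                identity -> UUID.randomUUID().toString());
    }
}
```

A retried callback returns the existing account, and the same email arriving from a different issuer gets a separate account instead of silently taking over the first one.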

12. Rate limiter performance

Rate limiter is both security control and performance dependency.

Dimensions:

source IP / network
normalized login identifier
account id if known
tenant id
client id
device fingerprint if defensible
route
credential type

Problem:

Too few dimensions => bypassable.
Too many dimensions => high cardinality and memory pressure.

Good limiter design:

cheap pre-limiter before expensive operations
stronger account limiter after account resolution
separate limiter for recovery, MFA, refresh, API key, introspection
bounded key cardinality
TTL on all counters
instrument rejection reason

Redis Lua pattern:

-- KEYS[1] = limiter key
-- ARGV[1] = max count
-- ARGV[2] = ttl seconds

local current = redis.call('INCR', KEYS[1])
if current == 1 then
  redis.call('EXPIRE', KEYS[1], ARGV[2])
end

if current > tonumber(ARGV[1]) then
  return 0
end

return 1

Be careful: a centralized Redis limiter can itself become a bottleneck.

Mitigations:

local pre-filter for obvious abuse
sharded keys
bounded cardinality
separate Redis cluster/db for security counters if needed
fallback mode documented
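
A local pre-filter can mirror the fixed-window logic of the Lua script inside the JVM, so obvious abuse is rejected before Redis is touched. A minimal sketch with hypothetical names; the map has no eviction, so a real version needs bounded cardinality and cleanup of idle keys.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: in-process fixed-window counter used as a pre-filter.
final class LocalFixedWindowLimiter {
    private static final class Window {
        final long startMs;
        final int count;
        Window(long startMs, int count) { this.startMs = startMs; this.count = count; }
    }

    private final ConcurrentMap<String, Window> windows = new ConcurrentHashMap<>();
    private final int maxCount;
    private final long windowMs;

    LocalFixedWindowLimiter(int maxCount, long windowMs) {
        this.maxCount = maxCount;
        this.windowMs = windowMs;
    }

    /** Returns true while the key is within its budget for the current window. */
    boolean tryAcquire(String key, long nowMs) {
        Window updated = windows.compute(key, (k, current) -> {
            if (current == null || nowMs - current.startMs >= windowMs) {
                return new Window(nowMs, 1); // new window, like INCR returning 1
            }
            // Saturate just above the limit so the counter cannot overflow.
            return new Window(current.startMs, Math.min(current.count + 1, maxCount + 1));
        });
        return updated.count <= maxCount;
    }
}
```

Each instance only sees its own traffic, so the local limit should be looser than the global one; the Redis limiter remains the source of truth.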

13. Threading model in Java auth

13.1 Servlet stack

In servlet/Spring MVC apps:

request thread enters filter chain
SecurityContext is usually associated with current thread
business handler runs on same thread unless async boundary exists

Blocking auth operations consume request threads:

password hash
DB lookup
Redis lookup
remote introspection
OIDC token exchange

Use:

timeouts
connection pool limits
bulkheads
bounded executors
backpressure

13.2 Reactive stack

In reactive apps, blocking auth is more dangerous.

Bad:

Run password hash or JDBC call on event loop.

Better:

move blocking work to bounded scheduler
keep SecurityContext in reactive context, not ThreadLocal assumption
instrument scheduler queue

Even if this series mostly focuses on servlet/JAX-RS/Spring MVC patterns, the invariant is universal:

Do not run unpredictable blocking auth work on a scarce execution resource.

14. Caching strategy: what can and cannot be cached

Data	Cache?	Notes
JWKS public keys	Yes	By issuer + kid; handle rotation
OIDC discovery metadata	Yes	Refresh periodically
JWT validation result	Rarely	Usually unnecessary; adds token replay risk and cache complexity
Opaque introspection active result	Yes, short TTL	TTL determines revocation lag
Opaque inactive result	Yes, very short TTL	Prevent random token storm
Session record	Yes	Must respect revocation/expiry
API key metadata	Yes	Must handle revocation/rotation lag
Password hash verification result	No	Credential verification must be fresh
MFA challenge result	No	Single-use/short-lived state
Risk score	Maybe	Cache signals, not final decision blindly

Cache rule:

Cache public metadata aggressively.
Cache security decisions only with explicit revocation-lag acceptance.
Never cache raw secrets.

15. Backpressure and graceful degradation

Auth dependencies fail differently.

Dependency	Failure impact	Fallback posture
Password DB	Login unavailable	Fail closed
Session Redis	Session auth unavailable or degraded	Usually fail closed for protected routes
JWKS endpoint	Existing cached keys may continue	Use cached keys until safe TTL; unknown kid fail closed
Introspection endpoint	Opaque token validation impaired	Fail closed unless explicit degraded policy exists
Audit pipeline	Security visibility reduced	Prefer local buffer/spool; do not block forever
Risk engine	Step-up quality reduced	Fall back to conservative policy
IdP token endpoint	Federated login unavailable	Existing sessions may continue

Do not invent fallback during incident. Define it before incident.

Bad fallback:

IdP is down, so temporarily skip token validation.

Good fallback:

IdP is down.
Existing sessions continue until expiry.
New federated login unavailable.
Local break-glass admin path remains protected by hardware MFA.
Unknown JWT kid fails closed.
Cached known keys valid until configured maximum stale window.

16. Metrics: what to measure

Authentication metrics should be split by path and outcome.

Request-path metrics

auth.request.count{mechanism, outcome, tenant, client_type}
auth.request.latency{mechanism}
auth.jwt.validation.latency{issuer, outcome}
auth.jwks.cache.hit{issuer}
auth.jwks.refresh.count{issuer, outcome}
auth.session.lookup.latency{store, outcome}
auth.opaque.introspection.latency{issuer, outcome}

Ceremony metrics

auth.login.count{outcome, tenant, mechanism}
auth.login.latency{outcome}
auth.password.verify.latency{algorithm, outcome}
auth.password.hash.executor.queue_depth
auth.password.hash.executor.rejected
auth.mfa.challenge.count{factor, outcome}
auth.refresh.rotation.count{outcome}
auth.rate_limit.rejected{dimension, route}

Security metrics

auth.enumeration_suspected.count
auth.credential_stuffing_suspected.count
auth.token_reuse_detected.count
auth.unknown_kid.count{issuer}
auth.invalid_audience.count
auth.invalid_issuer.count
auth.cross_tenant_rejected.count

Avoid high-cardinality labels:

Do not label metrics with raw username, email, token, or session id. Add IP as a label only if its cardinality and privacy risk are acceptable.

Use hashed/bucketed fields in logs, not metrics labels.

17. SLO examples

Example SLOs:

Request-path authentication
99.9% of JWT validations complete under 20 ms excluding upstream network.
99.9% of session lookups complete under 25 ms.
JWKS cache hit ratio above 99.5% during steady state.

Login ceremony
99% of password login attempts complete under 1200 ms when dependencies healthy.
0 accepted sessions before full authentication completion.
Password hash executor rejection below 0.1% outside attack windows.

OAuth/OIDC
99% of callback handling completes under 2500 ms when IdP healthy.
0 callbacks accepted without matching state.
0 ID tokens accepted with invalid issuer/audience/signature/nonce.

Notice: security invariant SLOs are often zero-tolerance.

0 invalid token accepted
0 disabled account login accepted
0 cross-tenant token accepted

Latency may degrade. Security invariant must not.

18. Load testing scenarios

Do not load test only valid login.

Test matrix:

1. Valid login steady state
2. Invalid password storm against known account
3. Username spray against random identifiers
4. Credential stuffing across many accounts
5. Login + synthetic verification pressure
6. JWT valid high-throughput API requests
7. JWT unknown kid storm
8. Expired token storm
9. Opaque token random value storm
10. Session Redis latency injection
11. IdP JWKS outage during key rotation
12. Refresh token rotation race
13. MFA challenge flood
14. API key high-throughput usage
15. HMAC large payload signing

For every scenario measure:

P50/P95/P99 latency
CPU
memory
GC
thread pool usage
connection pool usage
queue depth
rate limiter rejection
business endpoint collateral damage
error semantics
security invariant

The last item matters most.

A load test that passes by accepting invalid tokens is a failed test.

19. Failure mode catalog

19.1 Hash cost too high

Symptoms:

login latency spike
CPU saturated
password hash executor queue grows
health checks fail
web request threads blocked

Controls:

bounded executor
rate limit before hash
capacity dashboard
feature flag for login challenge escalation
parameter ADR

19.2 Hash cost too low

Symptoms:

system performs well
but stolen hashes become easier to crack offline

Controls:

periodic hash parameter review
breached password defense
rehash-on-login
hardware-aware benchmark

19.3 JWKS cache miss storm

Symptoms:

unknown kid count spikes
JWKS endpoint QPS spikes
API latency increases
IdP rate limits resource servers

Controls:

unknown kid refresh limiter
issuer allowlist
cache warm-up
fail closed for unknown key
alerting

19.4 Introspection storm

Symptoms:

opaque introspection endpoint saturated
random token invalid rate high
resource servers slow

Controls:

negative cache short TTL
pre-parse token format
bounded pool
timeout
rate limit per source/client

19.5 Session store hot key

Symptoms:

Redis CPU spike
latency concentrated on session index keys
logout-all operation slow

Controls:

sharded indexes
async cleanup
bounded session listing
session record small
last-seen write throttling

19.6 Audit pipeline backpressure

Symptoms:

login path blocked on audit publish
audit queue full
dropped security events

Controls:

local bounded buffer
separate critical audit persistence
backpressure policy documented
drop policy only for non-critical telemetry

20. Implementation blueprint: auth performance guardrails

A production-grade Java auth system should have explicit guardrails:

Dedicated password hash executor
Per-route rate limiter
Per-account and per-source throttling
JWKS cache with refresh limiter
Token introspection cache with short TTL
Session store timeout and pool limit
Audit event async boundary
Metrics for every auth mechanism
Structured security logs
Capacity runbook
Load test suite

Key point:

Every expensive auth operation sits behind an explicit capacity boundary.

21. Production checklist

Use this checklist before deploying auth changes.

Password/auth ceremony

[ ] Password algorithm and parameters documented.
[ ] Benchmark performed on production-like CPU/memory limits.
[ ] Hash verification uses bounded executor.
[ ] Queue full behavior tested.
[ ] Rate limiter runs before expensive verification.
[ ] Synthetic verification cannot be abused unboundedly.
[ ] Rehash-on-login measured and bounded.
[ ] Login storm load test executed.

Token/resource server

[ ] JWT issuer/audience/expiry/signature validated.
[ ] JWKS cache exists and is warmed.
[ ] Unknown kid refresh is rate-limited.
[ ] Token size limits enforced.
[ ] Opaque token introspection has timeout and cache.
[ ] Invalid token storm tested.
[ ] Metrics split invalid_signature/expired/invalid_audience/unknown_kid.

Session

[ ] Session record is small.
[ ] Store timeout is below endpoint timeout.
[ ] Last-seen write amplification controlled.
[ ] Revoke-all operation tested at realistic scale.
[ ] Session store outage behavior documented.
[ ] Concurrent session control measured.

Operations

[ ] SLOs defined.
[ ] Alerts exist for latency, rejection, queue depth, unknown kid, introspection error.
[ ] Dashboards split request-path and ceremony auth.
[ ] Load tests include attack-like traffic.
[ ] Runbook includes capacity emergency steps.

22. Exercises

Exercise 1 — Build an auth latency budget

Given:

API P95 target: 150 ms
Business handler P95: 90 ms
DB P95: 35 ms
Serialization P95: 8 ms

Define max auth P95 budget and decide whether remote token introspection per request is acceptable.

Exercise 2 — Tune password verification capacity

Given:

Argon2id verify P95: 320 ms
Pod CPU limit: 2 vCPU
Expected peak valid login: 30/sec
Attack traffic: 500 invalid/sec

Design:

hash worker count
queue size
rate limiter dimensions
rejection semantics
metrics

Exercise 3 — Unknown kid storm

Simulate JWTs with random kid values.

Verify:

JWKS endpoint is not called for every random kid
requests fail closed
unknown kid metric spikes
business endpoint remains healthy

Exercise 4 — Redis session latency injection

Inject 200 ms Redis latency.

Observe:

session-authenticated endpoint P99
connection pool saturation
thread pool blocking
circuit breaker behavior
alert timing

23. Key takeaways

Authentication performance is not generic optimization.

The core rules:

Separate hot request-path auth from expensive ceremony auth.
Put every expensive auth operation behind a capacity boundary.
Tune password hashing with benchmark, not folklore.
Cache public metadata; cache security decisions only with explicit revocation trade-off.
Never let unknown kid, random token, or fake username traffic create unbounded expensive work.
Measure auth by mechanism, outcome, and latency distribution.
Security invariant beats latency target.

A system that logs in quickly but accepts weak tokens is broken.

A system that hashes passwords securely but collapses under username spray is also broken.

Production-grade auth is the balance: expensive for attackers, bounded for defenders, predictable for users.

References

Spring Security Reference — Password Storage: adaptive one-way functions and work factor tuning.
Spring Security Reference — OAuth2 Resource Server JWT validation.
OWASP Password Storage Cheat Sheet.
OWASP Authentication Cheat Sheet.
OWASP API Security — Lack of Resources and Rate Limiting.
NIST SP 800-63B-4 — Digital Identity Guidelines: Authentication and Authenticator Management.
RFC 6750 — OAuth 2.0 Bearer Token Usage.
RFC 7009 — OAuth 2.0 Token Revocation.
RFC 7662 — OAuth 2.0 Token Introspection.
RFC 8725 — JSON Web Token Best Current Practices.
RFC 9700 — Best Current Practice for OAuth 2.0 Security.
