Authentication Observability

Learn Java Authentication Pattern - Part 035

Authentication observability for Java systems: audit events, security events, structured logging, metrics, tracing, detection signals, correlation, privacy boundaries, the Spring Security event publisher, Jakarta filter instrumentation, a SIEM-ready schema, and incident evidence.

Part 035 — Authentication Observability

Goal of this part: build authentication observability that can be used for debugging, audit, threat detection, incident response, compliance evidence, and architecture review. The focus is not "adding logs" but designing an event model that is defensible, leaks minimal data, and can answer hard questions while the system is under attack.

Authentication that is not observable is a black box.

When a login fails, a token leaks, refresh token reuse is detected, a session is hijacked, MFA is abused, or tenants get mixed up, the first question from engineering, security, compliance, and the business is usually not:

Did the login endpoint return 401?

The questions are harder:

Who tried to authenticate?
From where?
Using which factor?
Against which tenant/client?
Was the credential valid?
Was the account locked?
Was MFA required?
Was MFA passed?
Was the session created?
Was a token issued?
Which token family?
Was the refresh token reused?
Was the request blocked by rate limit?
Was this part of a larger attack pattern?
Which users are affected?
Can we prove what happened?

If auth events are not designed, the answers become speculation.

Top 1% engineers do not treat logging as decoration. They treat it as part of the control plane.

Authentication control without observability is not controllable.

1. Mental model: auth observability is more than logs

Auth observability consists of four layers:

1. Structured event
   A discrete fact about an authentication occurrence.

2. Metrics
   Quantitative aggregation for trends, alerts, SLOs, and anomalies.

3. Traces
   The cross-service flow of an authentication/token/session request.

4. Audit evidence
   A record that survives review: who did what, when, against which boundary, with what outcome.

Do not collapse all of these into the single concept of a "log".

Key differences

| Output | Purpose | Example | Retention | Sensitivity |
| --- | --- | --- | --- | --- |
| Application log | Operational debugging | login failed due to locked account | short to medium | medium |
| Security event | Attack detection | credential_stuffing_suspected | medium to long | high |
| Audit event | Evidence of an action | session_created, password_changed | long | high |
| Metric | Aggregation | login failure rate | short to medium | low if labels are safe |
| Trace | Debugging the path | auth service -> redis -> db -> idp | short | medium |

Important invariant:

A security-relevant authentication transition must emit a structured event.

If state changes but no event is emitted, the system loses evidence.


2. What must be observable in authentication?

Authentication is not a single event. It is a sequence of transitions.

Every important transition produces an event.

Minimum events

| Area | Events |
| --- | --- |
| Login | auth.attempt.started, auth.login.succeeded, auth.login.failed |
| Password | auth.password.changed, auth.password.reset.requested, auth.password.reset.completed |
| MFA | auth.mfa.challenge.required, auth.mfa.challenge.succeeded, auth.mfa.challenge.failed, auth.mfa.factor.enrolled, auth.mfa.factor.removed |
| Session | auth.session.created, auth.session.rotated, auth.session.revoked, auth.session.expired |
| Token | auth.token.issued, auth.token.refreshed, auth.token.revoked, auth.refresh.reuse_detected |
| API key | auth.api_key.created, auth.api_key.used, auth.api_key.rotated, auth.api_key.revoked |
| HMAC/mTLS | auth.request_signature.failed, auth.client_cert.accepted, auth.client_cert.rejected |
| OAuth/OIDC | auth.oidc.callback.received, auth.oidc.login.succeeded, auth.oidc.login.failed, auth.federation.mapping.failed |
| Abuse | auth.rate_limit.exceeded, auth.account.locked, auth.suspicious_activity.detected |
| Admin | auth.admin.session_revoked, auth.admin.factor_reset, auth.admin.account_disabled |

Do not emit events only on success. Failure events are often the more important ones.


3. An auth event must answer 9 questions

Every authentication event should be able to answer:

1. What happened?
2. When did it happen?
3. Who was attempting the action?
4. Which account/subject was affected?
5. Which tenant/client/application boundary?
6. Which authentication method/factor?
7. What was the outcome?
8. Why did the system decide that outcome?
9. How can we correlate this to request/session/token/trace?

Minimal field model:

event_id
occurred_at
event_type
severity
outcome
reason_code
request_id
trace_id
correlation_id
tenant_id
client_id
actor_subject_id
target_account_id
credential_id / authenticator_id / session_id_hash / token_family_id
authentication_method
assurance_level
source_ip_hash or source_ip_prefix
user_agent_hash
device_id_hash
risk_score
risk_reasons

Note the use of *_hash. Auth observability must support investigation without becoming a new database of secrets.
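The source_ip_prefix field in the model above can be derived before an event is built. What follows is a minimal sketch, not the article's implementation: the class name SourceIpPrefixes, the /24 and /48 prefix lengths, and the "unknown" fallback are all assumptions chosen for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper: reduces a source IP literal to a coarse prefix for telemetry. */
final class SourceIpPrefixes {
    private SourceIpPrefixes() {}

    /** Returns "a.b.c.0/24" for IPv4, a /48 prefix for IPv6, or "unknown" otherwise. */
    static String toPrefix(String ipLiteral) {
        if (ipLiteral == null || ipLiteral.isBlank()) {
            return "unknown";
        }
        return ipLiteral.indexOf(':') >= 0 ? ipv6Prefix(ipLiteral) : ipv4Prefix(ipLiteral);
    }

    private static String ipv4Prefix(String ip) {
        String[] parts = ip.split("\\.", -1);
        if (parts.length != 4) {
            return "unknown";
        }
        for (String part : parts) {
            // Accept only 1-3 decimal digits in the 0-255 range.
            if (!part.matches("\\d{1,3}") || Integer.parseInt(part) > 255) {
                return "unknown";
            }
        }
        // Keep the first three octets and zero the host octet.
        return Integer.parseInt(parts[0]) + "." + Integer.parseInt(parts[1]) + "."
            + Integer.parseInt(parts[2]) + ".0/24";
    }

    private static String ipv6Prefix(String ip) {
        // Only hex digits and colons are accepted, so hostnames and IPv4-mapped
        // forms are rejected up front (an assumed simplification).
        if (!ip.matches("[0-9A-Fa-f:]{2,39}")) {
            return "unknown";
        }
        try {
            byte[] addr = InetAddress.getByName(ip).getAddress();
            if (addr.length != 16) {
                return "unknown";
            }
            // Keep the first 48 bits (6 bytes) and zero the rest.
            for (int i = 6; i < addr.length; i++) {
                addr[i] = 0;
            }
            return InetAddress.getByAddress(addr).getHostAddress() + "/48";
        } catch (UnknownHostException e) {
            return "unknown";
        }
    }
}
```

The prefix lengths are a policy decision. A stricter privacy policy could use a shorter prefix, and a production version would likely use a dedicated IP parsing library for IPv6 rather than InetAddress.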


4. Do not leak secrets into observability

Auth logs are often the second source of leaks after the application database.

Must never be logged:

password
password reset token
one-time password
TOTP secret
raw recovery code
raw session id
raw access token
raw refresh token
raw API key
HMAC shared secret
private key
authorization code
code_verifier
full client secret
full email if the privacy policy forbids it
full phone number
full precise geolocation if it is not needed

May be recorded in a safe form:

subject_id internal
account_id internal
tenant_id
client_id
key prefix
credential id
factor id
session id hash
token jti hash
token family id
IP prefix or risk-normalized location
user agent hash
reason code

Wrong example:

log.info("login failed username={} password={}", username, password);

Right example:

log.info("auth.login.failed account_lookup={} tenant={} reason={} request_id={}",
    lookupResult.category(), tenantId, reasonCode, requestId);

Better still: a structured event rather than a free-form string.
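One way to enforce the "must never be logged" list mechanically is a guard that runs over event attributes before they reach any sink. This is a minimal sketch under assumptions: TelemetryAttributeGuard, its key names, and the "[REDACTED]" marker are hypothetical, and exact-key matching is only one possible policy.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical guard: redacts attribute values whose keys name raw secrets. */
final class TelemetryAttributeGuard {
    // Assumed denylist of exact attribute keys; hashed or internal-ID variants
    // such as session_id_hash or token_family_id are not listed and pass through.
    private static final Set<String> FORBIDDEN_KEYS = Set.of(
        "password", "reset_token", "otp", "totp_secret", "recovery_code",
        "session_id", "access_token", "refresh_token", "api_key",
        "hmac_secret", "private_key", "authorization_code",
        "code_verifier", "client_secret");

    private TelemetryAttributeGuard() {}

    static boolean isForbidden(String key) {
        return FORBIDDEN_KEYS.contains(key.toLowerCase(Locale.ROOT));
    }

    /** Returns a copy in which forbidden keys keep their name but lose their value. */
    static Map<String, String> sanitize(Map<String, String> attributes) {
        Map<String, String> safe = new LinkedHashMap<>();
        attributes.forEach((key, value) ->
            safe.put(key, isForbidden(key) ? "[REDACTED]" : value));
        return safe;
    }
}
```

Keeping the key while redacting the value makes the mistake visible in review without leaking the secret. A unit test that feeds known-bad keys through the publisher is a cheap regression check.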


5. Event taxonomy: stable naming

Use event names that are stable and machine-readable.

Recommended format:

auth.<object>.<verb>

Examples:

auth.login.succeeded
auth.login.failed
auth.session.created
auth.session.revoked
auth.token.refreshed
auth.refresh.reuse_detected
auth.mfa.challenge.failed
auth.api_key.rotated
auth.rate_limit.exceeded

Avoid names that read like natural language:

User logged in successfully from browser
Oops login failed
Bad token

Event names must survive for years, because downstream detection rules, dashboards, SIEM parsers, and audit reports will depend on them.
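The naming format can be checked at build or startup time so that a free-form name never reaches a sink. The sketch below is an assumption-laden illustration: AuthEventNames is a hypothetical class, and the choice to allow two to four lowercase segments after "auth" is inferred from the examples above, not stated by the article.

```java
import java.util.regex.Pattern;

/** Hypothetical validator for the auth.<object>.<verb> naming format. */
final class AuthEventNames {
    // "auth" followed by 2 to 4 dot-separated lowercase segments, which covers
    // names such as auth.login.failed and auth.mfa.challenge.failed.
    private static final Pattern NAME =
        Pattern.compile("auth(\\.[a-z][a-z0-9_]*){2,4}");

    private AuthEventNames() {}

    static boolean isValid(String eventType) {
        return eventType != null && NAME.matcher(eventType).matches();
    }

    /** Fails fast so that a badly named event is caught in tests, not in the SIEM. */
    static String requireValid(String eventType) {
        if (!isValid(eventType)) {
            throw new IllegalArgumentException("invalid auth event name: " + eventType);
        }
        return eventType;
    }
}
```

Calling requireValid inside the event builder keeps the taxonomy enforceable without a separate lint step.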


6. Auth event severity

Severity is not an emotion. Severity is a routing signal.

| Severity | Meaning | Example |
| --- | --- | --- |
| INFO | Normal state transition | login success, logout |
| NOTICE | Security-relevant but expected | password changed, MFA enrolled |
| WARN | Suspicious or policy denied | repeated failures, rate limit |
| ERROR | System failed to enforce cleanly | IdP validation unavailable |
| CRITICAL | Active compromise signal | refresh token reuse, signing key compromise |

Do not assign ERROR to every failed login. That produces alert fatigue.


7. Reason code: the most important part that often goes missing

The user-facing message must be generic.

Invalid credentials.

The internal event must be specific.

PASSWORD_MISMATCH
ACCOUNT_NOT_FOUND_SYNTHETIC_PATH
ACCOUNT_LOCKED
ACCOUNT_DISABLED
TENANT_NOT_FOUND
TENANT_MEMBERSHIP_MISSING
MFA_REQUIRED
MFA_CODE_INVALID
MFA_FACTOR_LOCKED
RATE_LIMITED_BY_ACCOUNT
RATE_LIMITED_BY_IP_PREFIX
TOKEN_EXPIRED
TOKEN_AUDIENCE_INVALID
TOKEN_ISSUER_INVALID
TOKEN_SIGNATURE_INVALID
REFRESH_TOKEN_REUSE_DETECTED
SESSION_NOT_FOUND
SESSION_REVOKED
API_KEY_HASH_NOT_FOUND
HMAC_NONCE_REPLAYED
CLIENT_CERT_UNTRUSTED

Reason codes let the system:

  • distinguish bugs from attacks;
  • build sharp dashboards;
  • run detection rules;
  • perform incident triage;
  • keep the response to the user generic without losing internal insight.

Invariant:

Generic outward response, specific internal reason code.
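The invariant can be encoded in the type that carries the reason, so the outward message and the internal code cannot drift apart. This is a sketch, assuming a hypothetical LoginFailureReason enum with a subset of the codes listed above; the HTTP statuses and the separate rate-limit message are illustrative assumptions.

```java
/** Hypothetical enum pairing a specific internal reason with a generic outward response. */
enum LoginFailureReason {
    PASSWORD_MISMATCH(401),
    ACCOUNT_NOT_FOUND_SYNTHETIC_PATH(401),
    ACCOUNT_LOCKED(401),
    ACCOUNT_DISABLED(401),
    MFA_CODE_INVALID(401),
    // Assumed safe to reveal: an IP-prefix limit says nothing about account state.
    RATE_LIMITED_BY_IP_PREFIX(429);

    private final int httpStatus;

    LoginFailureReason(int httpStatus) {
        this.httpStatus = httpStatus;
    }

    int httpStatus() {
        return httpStatus;
    }

    /** What the caller sees: identical for every credential-related failure. */
    String outwardMessage() {
        return httpStatus == 429 ? "Too many requests." : "Invalid credentials.";
    }

    /** What goes into the auth event: the specific internal reason code. */
    String reasonCode() {
        return name();
    }
}
```

With this shape, a controller can only obtain the generic text, while the event publisher records reasonCode().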

8. Auth event domain model in Java

Start with a simple record.

package com.acme.auth.observability;

import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.UUID;

public record AuthEvent(
    UUID eventId,
    Instant occurredAt,
    String eventType,
    AuthSeverity severity,
    AuthOutcome outcome,
    String reasonCode,

    String tenantId,
    String clientId,
    String actorSubjectId,
    String targetAccountId,

    String requestId,
    String correlationId,
    String traceId,

    String authenticationMethod,
    String assuranceLevel,
    String sessionIdHash,
    String tokenFamilyId,
    String credentialId,
    String authenticatorId,

    String sourceIpHash,
    String sourceIpPrefix,
    String userAgentHash,
    String deviceIdHash,

    Integer riskScore,
    List<String> riskReasons,
    Map<String, String> attributes
) {}

Enums:

public enum AuthSeverity {
    INFO,
    NOTICE,
    WARN,
    ERROR,
    CRITICAL
}

public enum AuthOutcome {
    SUCCESS,
    FAILURE,
    DENIED,
    CHALLENGE_REQUIRED,
    REVOKED,
    EXPIRED,
    DETECTED
}

Publisher boundary:

public interface AuthEventPublisher {
    void publish(AuthEvent event);
}

Do not let business code pick sinks directly.

LoginService -> AuthEventPublisher -> Log/Kafka/DB/SIEM

Not:

LoginService -> log + kafka + db + metrics + SIEM SDK
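Behind that single boundary, fan-out to several sinks can live in one composite publisher. The sketch below uses simplified stand-ins (SimpleAuthEvent, SimpleAuthEventPublisher) instead of the full AuthEvent record so that it is self-contained; FanOutAuthEventPublisher and its failure callback are hypothetical names, not part of the article's code.

```java
import java.util.List;
import java.util.function.Consumer;

/** Simplified stand-in for the AuthEvent record, for illustration only. */
record SimpleAuthEvent(String eventType, String reasonCode) {}

/** Simplified stand-in for the AuthEventPublisher boundary. */
interface SimpleAuthEventPublisher {
    void publish(SimpleAuthEvent event);
}

/** Hypothetical composite: delivers to every sink and isolates sink failures. */
final class FanOutAuthEventPublisher implements SimpleAuthEventPublisher {
    private final List<SimpleAuthEventPublisher> sinks;
    private final Consumer<RuntimeException> onSinkFailure;

    FanOutAuthEventPublisher(
        List<SimpleAuthEventPublisher> sinks,
        Consumer<RuntimeException> onSinkFailure
    ) {
        this.sinks = List.copyOf(sinks);
        this.onSinkFailure = onSinkFailure;
    }

    @Override
    public void publish(SimpleAuthEvent event) {
        for (SimpleAuthEventPublisher sink : sinks) {
            try {
                sink.publish(event);
            } catch (RuntimeException e) {
                // One failing sink must not prevent delivery to the others.
                onSinkFailure.accept(e);
            }
        }
    }
}
```

The business code still sees one publish call. Which sinks exist, and what happens when one of them is down, becomes a wiring decision.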

9. Structured JSON logging

For production, auth events should be structured.

public final class JsonAuthEventPublisher implements AuthEventPublisher {
    private static final Logger log = LoggerFactory.getLogger("security.auth");
    private final ObjectMapper objectMapper;

    public JsonAuthEventPublisher(ObjectMapper objectMapper) {
        this.objectMapper = objectMapper;
    }

    @Override
    public void publish(AuthEvent event) {
        try {
            log.info(objectMapper.writeValueAsString(event));
        } catch (JsonProcessingException e) {
            log.error("auth_event_serialization_failed event_type={} event_id={}",
                event.eventType(), event.eventId(), e);
        }
    }
}

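In large systems, use a native structured-logging encoder so the JSON string is not double-encoded. The boundary stays the same, though: the output must have stable fields. Note that with Jackson, serializing the Instant field requires the JavaTimeModule (jackson-datatype-jsr310) to be registered on the ObjectMapper, and ISO-8601 timestamps like the one in the example additionally require WRITE_DATES_AS_TIMESTAMPS to be disabled.

Example event: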

{
  "eventId": "68f52767-9b29-490d-9a0e-02a41c0e2f6b",
  "occurredAt": "2026-07-03T12:00:00Z",
  "eventType": "auth.login.failed",
  "severity": "WARN",
  "outcome": "FAILURE",
  "reasonCode": "PASSWORD_MISMATCH",
  "tenantId": "tenant_123",
  "clientId": "portal-web",
  "targetAccountId": "acct_987",
  "requestId": "req_abc",
  "correlationId": "corr_xyz",
  "authenticationMethod": "PASSWORD",
  "sourceIpPrefix": "203.0.113.0/24",
  "userAgentHash": "ua_3e5b...",
  "riskScore": 42,
  "riskReasons": ["FAILED_PASSWORD", "NEW_DEVICE"]
}

10. Correlation model

Authentication happens across boundaries.

Minimal correlation identifiers:

| Field | Purpose |
| --- | --- |
| request_id | a single HTTP request |
| trace_id | a distributed trace across services |
| correlation_id | a business/security journey across requests |
| session_id_hash | a single login session, without leaking the session id |
| token_family_id | refresh token lineage |
| credential_id | the credential that was used |
| authenticator_id | MFA/passkey factor |
| tenant_id | isolation boundary |
| client_id | OAuth/client/application boundary |

Rule:

Never log raw session/token identifiers; log stable hashes or internal IDs.

Hashes for logs must use a secret pepper that is different from the password pepper.

public interface SensitiveIdentifierHasher {
    String hashForTelemetry(String rawValue);
}

Example:

public final class HmacSha256TelemetryHasher implements SensitiveIdentifierHasher {
    private final SecretKey key;

    public HmacSha256TelemetryHasher(SecretKey key) {
        this.key = key;
    }

    @Override
    public String hashForTelemetry(String rawValue) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(key);
            byte[] digest = mac.doFinal(rawValue.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("telemetry hash failed", e);
        }
    }
}

11. MDC usage: useful, but do not overdo it

MDC suits request-scoped fields:

public final class RequestCorrelationFilter extends OncePerRequestFilter {
    @Override
    protected void doFilterInternal(
        HttpServletRequest request,
        HttpServletResponse response,
        FilterChain filterChain
    ) throws ServletException, IOException {
        String requestId = Optional.ofNullable(request.getHeader("X-Request-Id"))
            .filter(this::isSafeRequestId)
            .orElseGet(() -> UUID.randomUUID().toString());

        try {
            MDC.put("request_id", requestId);
            MDC.put("path", request.getRequestURI());
            filterChain.doFilter(request, response);
        } finally {
            MDC.clear();
        }
    }

    private boolean isSafeRequestId(String value) {
        return value.length() <= 128 && value.matches("[A-Za-z0-9._:-]+.");
    }
}

A small bug above: the trailing . after + is a regex wildcard, so the pattern accepts (and in fact requires) one extra character of any kind at the end. Corrected version:

private boolean isSafeRequestId(String value) {
    return value != null
        && value.length() <= 128
        && value.matches("[A-Za-z0-9._:-]+");
}

MDC failure modes:

| Failure | Impact |
| --- | --- |
| not cleared | request data leaks into another request on the same pooled thread |
| too much PII | logs become a liability |
| raw token put into MDC | every downstream log line leaks the token |
| not propagated across async boundaries | traces/logs lose correlation |

Invariant:

MDC is for correlation, not for secrets.

12. Spring Security event integration

Spring Security publishes events for authentication success and failure. In enterprise systems, these built-in events usually need to be mapped to an internal schema.

@Component
public final class SpringAuthenticationEventListener {
    private final AuthEventPublisher publisher;
    private final RequestContextExtractor requestContextExtractor;

    public SpringAuthenticationEventListener(
        AuthEventPublisher publisher,
        RequestContextExtractor requestContextExtractor
    ) {
        this.publisher = publisher;
        this.requestContextExtractor = requestContextExtractor;
    }

    @EventListener
    public void onSuccess(AuthenticationSuccessEvent event) {
        RequestContext ctx = requestContextExtractor.current();
        Authentication authentication = event.getAuthentication();

        publisher.publish(AuthEventBuilder.base("auth.login.succeeded")
            .severity(AuthSeverity.INFO)
            .outcome(AuthOutcome.SUCCESS)
            .reasonCode("AUTHENTICATION_SUCCEEDED")
            .tenantId(ctx.tenantId())
            .clientId(ctx.clientId())
            .actorSubjectId(resolveSubject(authentication))
            .requestId(ctx.requestId())
            .traceId(ctx.traceId())
            .authenticationMethod(resolveMethod(authentication))
            .sourceIpPrefix(ctx.sourceIpPrefix())
            .userAgentHash(ctx.userAgentHash())
            .build());
    }

    @EventListener
    public void onFailure(AbstractAuthenticationFailureEvent event) {
        RequestContext ctx = requestContextExtractor.current();

        publisher.publish(AuthEventBuilder.base("auth.login.failed")
            .severity(AuthSeverity.WARN)
            .outcome(AuthOutcome.FAILURE)
            .reasonCode(mapFailure(event.getException()))
            .tenantId(ctx.tenantId())
            .clientId(ctx.clientId())
            .requestId(ctx.requestId())
            .traceId(ctx.traceId())
            .sourceIpPrefix(ctx.sourceIpPrefix())
            .userAgentHash(ctx.userAgentHash())
            .build());
    }

    private String mapFailure(AuthenticationException exception) {
        if (exception instanceof BadCredentialsException) return "BAD_CREDENTIALS";
        if (exception instanceof LockedException) return "ACCOUNT_LOCKED";
        if (exception instanceof DisabledException) return "ACCOUNT_DISABLED";
        if (exception instanceof CredentialsExpiredException) return "CREDENTIAL_EXPIRED";
        return "AUTHENTICATION_FAILED";
    }
}

Do not rely entirely on framework events. Many important transitions happen outside AuthenticationSuccessEvent:

  • refresh token rotation;
  • session revocation;
  • password reset;
  • MFA enrollment;
  • API key rotation;
  • IdP callback mapping;
  • rate limit block;
  • token introspection failure.

Use explicit domain events for those transitions.


13. Observability in a custom Spring filter

For token/resource servers, it is often better to instrument at the filter boundary.

public final class BearerTokenObservationFilter extends OncePerRequestFilter {
    private final AuthEventPublisher publisher;
    private final SensitiveIdentifierHasher hasher;

    public BearerTokenObservationFilter(
        AuthEventPublisher publisher,
        SensitiveIdentifierHasher hasher
    ) {
        this.publisher = publisher;
        this.hasher = hasher;
    }

    @Override
    protected void doFilterInternal(
        HttpServletRequest request,
        HttpServletResponse response,
        FilterChain chain
    ) throws ServletException, IOException {
        try {
            chain.doFilter(request, response);
        } finally {
            Authentication authentication = SecurityContextHolder.getContext().getAuthentication();

            if (authentication instanceof JwtAuthenticationToken jwtAuth) {
                Jwt jwt = jwtAuth.getToken();

                publisher.publish(AuthEventBuilder.base("auth.token.accepted")
                    .severity(AuthSeverity.INFO)
                    .outcome(AuthOutcome.SUCCESS)
                    .tenantId(extractTenant(jwt))
                    .clientId(jwt.getClaimAsString("azp"))
                    .actorSubjectId(jwt.getSubject())
                    .reasonCode("JWT_ACCEPTED")
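                    // iss, aud and jti are optional JWT claims and Map.of rejects null
                    // values, so absent claims are recorded as empty strings.
                    .attributes(Map.of(
                        "issuer", Objects.toString(jwt.getIssuer(), ""),
                        "audience", jwt.getAudience() == null ? "" : String.join(",", jwt.getAudience()),
                        "jti_hash", jwt.getId() == null ? "" : hasher.hashForTelemetry(jwt.getId())
                    ))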
                    .build());
            }
        }
    }
}

But be careful: emitting an auth.token.accepted event for every request can be very expensive. Use sampling or emit metrics on high-volume paths, and keep detailed events for security-relevant transitions.

Practical rule:

Emit detailed event for state-changing auth transitions.
Emit metrics/sampled traces for high-volume repeated validation.
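The practical rule can be reduced to one decision function. The following is a minimal sketch with assumed names: AuthEventSampler, the set of high-volume event types, and the hash-based 1-in-N strategy are illustrative choices, not something the article prescribes.

```java
import java.util.Set;

/** Hypothetical sampler: always keeps state-changing events, samples high-volume ones. */
final class AuthEventSampler {
    // Assumed set of event types that fire on every validated request.
    private static final Set<String> HIGH_VOLUME_TYPES = Set.of("auth.token.accepted");

    private final int keepOneIn;

    AuthEventSampler(int keepOneIn) {
        if (keepOneIn < 1) {
            throw new IllegalArgumentException("keepOneIn must be >= 1");
        }
        this.keepOneIn = keepOneIn;
    }

    boolean shouldEmit(String eventType, String requestId) {
        if (!HIGH_VOLUME_TYPES.contains(eventType)) {
            // Security-relevant transitions are never sampled away.
            return true;
        }
        // Deterministic per request id, so retries of the same request agree.
        return Math.floorMod(requestId.hashCode(), keepOneIn) == 0;
    }
}
```

Deterministic sampling keeps the decision reproducible during an investigation. The unsampled volume should still be visible through a counter metric.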

14. Jakarta/JAX-RS filter instrumentation

For non-Spring stacks:

@Provider
@Priority(Priorities.AUTHENTICATION)
public final class JwtAuthenticationFilter implements ContainerRequestFilter {
    private final JwtVerifier verifier;
    private final AuthEventPublisher events;

    public JwtAuthenticationFilter(JwtVerifier verifier, AuthEventPublisher events) {
        this.verifier = verifier;
        this.events = events;
    }

    @Override
    public void filter(ContainerRequestContext requestContext) {
        RequestContext ctx = RequestContext.from(requestContext);
        String bearer = extractBearer(requestContext);

        if (bearer == null) {
            events.publish(AuthEventBuilder.base("auth.token.missing")
                .severity(AuthSeverity.INFO)
                .outcome(AuthOutcome.DENIED)
                .reasonCode("BEARER_TOKEN_MISSING")
                .requestId(ctx.requestId())
                .build());
            abortUnauthorized(requestContext);
            return;
        }

        try {
            VerifiedJwt jwt = verifier.verify(bearer);
            requestContext.setSecurityContext(new JwtSecurityContext(jwt));
        } catch (JwtVerificationException ex) {
            events.publish(AuthEventBuilder.base("auth.token.rejected")
                .severity(AuthSeverity.WARN)
                .outcome(AuthOutcome.DENIED)
                .reasonCode(ex.reasonCode())
                .requestId(ctx.requestId())
                .tenantId(ctx.tenantId())
                .build());
            abortUnauthorized(requestContext);
        }
    }
}

A JAX-RS filter provides a good boundary for authentication, but the domain events must stay consistent with those of other stacks.


15. Metrics: what you count must be safe

Auth metrics support alerting and capacity planning.

Example metrics:

auth_login_attempt_total{outcome, method, tenant_tier}
auth_login_failure_total{reason_code, method}
auth_rate_limited_total{dimension}
auth_mfa_challenge_total{factor_type, outcome}
auth_token_refresh_total{outcome}
auth_refresh_reuse_detected_total
auth_session_active_count{tenant_tier}
auth_password_hash_duration_seconds{algorithm}
auth_jwt_validation_duration_seconds{issuer}
auth_idp_callback_duration_seconds{provider}

Do not use high-cardinality labels:

user_id
email
session_id
ip_address
raw_client_id if the number of clients is very large and unbounded

High-cardinality labels make the metrics backend expensive and often break it.

Micrometer example:

public final class AuthMetrics {
    private final MeterRegistry registry;

    public AuthMetrics(MeterRegistry registry) {
        this.registry = registry;
    }

    public void recordLoginAttempt(String method, String outcome, String reasonCode) {
        Counter.builder("auth.login.attempt")
            .tag("method", safe(method))
            .tag("outcome", safe(outcome))
            .tag("reason", safeReason(reasonCode))
            .register(registry)
            .increment();
    }

    public <T> T timePasswordHash(String algorithm, Supplier<T> action) {
        return Timer.builder("auth.password.hash.duration")
            .tag("algorithm", safe(algorithm))
            .publishPercentileHistogram()
            .register(registry)
            .record(action);
    }
}

Metric invariant:

Metrics labels must be low-cardinality and non-secret.
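The Micrometer example calls safe(...) and safeReason(...) without showing them. One possible shape for such helpers is sketched below. Everything here is an assumption: the class name MetricTagValues, the allowlisted reason codes, and the "OTHER" / "other" / "unknown" fallbacks.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical helpers that keep metric tag values bounded and non-secret. */
final class MetricTagValues {
    // Assumed allowlist: only known reason codes may become tag values.
    private static final Set<String> KNOWN_REASONS = Set.of(
        "PASSWORD_MISMATCH", "ACCOUNT_LOCKED", "ACCOUNT_DISABLED",
        "MFA_CODE_INVALID", "RATE_LIMITED_BY_ACCOUNT", "RATE_LIMITED_BY_IP_PREFIX");

    // Short lowercase identifiers only; emails, tokens and ids do not match.
    private static final Pattern SIMPLE = Pattern.compile("[a-z0-9_]{1,32}");

    private MetricTagValues() {}

    /** Collapses anything outside the allowlist into a single bucket. */
    static String safeReason(String reasonCode) {
        return reasonCode != null && KNOWN_REASONS.contains(reasonCode) ? reasonCode : "OTHER";
    }

    /** Normalizes simple enum-like values and buckets everything else. */
    static String safe(String value) {
        if (value == null) {
            return "unknown";
        }
        String normalized = value.trim().toLowerCase(Locale.ROOT);
        return SIMPLE.matcher(normalized).matches() ? normalized : "other";
    }
}
```

The allowlist bounds cardinality for reason codes. The pattern in safe only rejects obviously unsafe shapes, so callers should still pass values from a closed set.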

16. Auth dashboards that are actually useful

Minimum dashboard:

Login health

login attempts per minute
success rate
failure rate by reason
p95/p99 login latency
password hash duration
account lookup latency
session creation latency

Abuse

rate-limited requests
top failure reason trend
password spray indicator
credential stuffing indicator
MFA failure spike
refresh token reuse detection
API key failure spike
HMAC nonce replay spike

Token/session

active sessions
session revocation rate
refresh token rotation failure
expired token rejection
audience mismatch rejection
issuer mismatch rejection
JWKS refresh failures

Federation

OIDC callback failures
state mismatch
nonce mismatch
IdP latency
JIT provisioning failure
SAML assertion validation failure

Tenant

login success/failure by tenant tier
tenant resolution failures
cross-tenant token rejection
membership missing failures

Do not build dashboards that are merely pretty. A dashboard must help triage.


17. Detection signals

Auth security detections generally come from a combination of events, not from a single event.

Credential stuffing

Many distinct account identifiers
from same IP prefix/device/user agent family
with password mismatch
within short window

Pseudo rule:

SELECT source_ip_prefix, COUNT(DISTINCT target_account_id) AS accounts, COUNT(*) AS failures
FROM auth_events
WHERE event_type = 'auth.login.failed'
  AND reason_code IN ('PASSWORD_MISMATCH', 'BAD_CREDENTIALS')
  AND occurred_at > now() - interval '10 minutes'
GROUP BY source_ip_prefix
HAVING COUNT(DISTINCT target_account_id) > 50
   AND COUNT(*) > 200;
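The same rule can be evaluated in process, for example inside a stream consumer or a unit test for the detection logic. This is a sketch under assumptions: FailedLogin and CredentialStuffingDetector are hypothetical types, and the thresholds are passed in rather than fixed.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical projection of an auth.login.failed event. */
record FailedLogin(Instant occurredAt, String sourceIpPrefix, String targetAccountId) {}

/** In-memory equivalent of the pseudo rule: many accounts, many failures, one prefix. */
final class CredentialStuffingDetector {
    private CredentialStuffingDetector() {}

    static Set<String> suspiciousPrefixes(
        List<FailedLogin> failures, Instant now, Duration window,
        int minDistinctAccounts, int minFailures
    ) {
        Instant cutoff = now.minus(window);
        Map<String, Set<String>> accountsByPrefix = new HashMap<>();
        Map<String, Integer> failuresByPrefix = new HashMap<>();

        for (FailedLogin failure : failures) {
            if (!failure.occurredAt().isAfter(cutoff)) {
                continue; // outside the time window
            }
            accountsByPrefix
                .computeIfAbsent(failure.sourceIpPrefix(), k -> new HashSet<>())
                .add(failure.targetAccountId());
            failuresByPrefix.merge(failure.sourceIpPrefix(), 1, Integer::sum);
        }

        Set<String> suspicious = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : failuresByPrefix.entrySet()) {
            int distinctAccounts = accountsByPrefix.get(entry.getKey()).size();
            // Mirrors the HAVING clause: both thresholds must be exceeded.
            if (distinctAccounts > minDistinctAccounts && entry.getValue() > minFailures) {
                suspicious.add(entry.getKey());
            }
        }
        return suspicious;
    }
}
```

Keeping thresholds as parameters makes it possible to tune them per tenant tier without touching the rule itself.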

Password spraying

Same password campaign cannot be logged directly,
but pattern is many accounts, low frequency each,
same IP/device/user agent, generic bad credentials.

Account takeover suspicion

Successful login from new device
after many failed attempts
followed by password/MFA/recovery changes.

Refresh token theft

refresh token reuse detected
or same token family used from incompatible device/location.

MFA fatigue

many push challenges
few approvals
then sudden success
from suspicious source.

Tenant confusion attempt

token issuer valid
but tenant route mismatch or membership missing.


18. Audit table design

For high-value systems, do not rely on the log pipeline alone. Store important audit events in a database or an append-only event store.

CREATE TABLE auth_audit_event (
    event_id UUID PRIMARY KEY,
    occurred_at TIMESTAMPTZ NOT NULL,
    event_type TEXT NOT NULL,
    severity TEXT NOT NULL,
    outcome TEXT NOT NULL,
    reason_code TEXT NOT NULL,

    tenant_id TEXT,
    client_id TEXT,
    actor_subject_id TEXT,
    target_account_id TEXT,

    request_id TEXT,
    correlation_id TEXT,
    trace_id TEXT,

    authentication_method TEXT,
    assurance_level TEXT,
    session_id_hash TEXT,
    token_family_id TEXT,
    credential_id TEXT,
    authenticator_id TEXT,

    source_ip_hash TEXT,
    source_ip_prefix TEXT,
    user_agent_hash TEXT,
    device_id_hash TEXT,

    risk_score INTEGER,
    risk_reasons JSONB NOT NULL DEFAULT '[]'::jsonb,
    attributes JSONB NOT NULL DEFAULT '{}'::jsonb
);

CREATE INDEX idx_auth_audit_event_time
    ON auth_audit_event (occurred_at DESC);

CREATE INDEX idx_auth_audit_event_account_time
    ON auth_audit_event (target_account_id, occurred_at DESC);

CREATE INDEX idx_auth_audit_event_tenant_time
    ON auth_audit_event (tenant_id, occurred_at DESC);

CREATE INDEX idx_auth_audit_event_type_time
    ON auth_audit_event (event_type, occurred_at DESC);

CREATE INDEX idx_auth_audit_event_reason_time
    ON auth_audit_event (reason_code, occurred_at DESC);

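Partition by time for large volumes. In PostgreSQL this requires the parent table to be declared with PARTITION BY RANGE (occurred_at), and the primary key must then include the partition column, for example PRIMARY KEY (event_id, occurred_at):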

CREATE TABLE auth_audit_event_2026_07
    PARTITION OF auth_audit_event
    FOR VALUES FROM ('2026-07-01') TO ('2026-08-01');

Retention must be explicit:

| Event | Example retention |
| --- | --- |
| low-risk login success | 30-90 days |
| login failure aggregate | 90-180 days |
| password/MFA change | 1-7 years depending on regulation |
| token/session revoke | 180 days to 1 year |
| admin security action | 1-7 years |
| compromise evidence | legal hold / incident policy |

19. Audit event reliability

The hard question:

Should authentication fail if audit event cannot be written?

The answer depends on the event.

| Event | If the audit write fails |
| --- | --- |
| login failed | do not take login down globally; fall back to a local log |
| login success | ideally emit; if the sink is down, use a buffer/outbox |
| password changed | should be transactionally recorded |
| MFA removed | should be transactionally recorded |
| admin session revoke | should be transactionally recorded |
| key compromise action | must be durable, or fail closed according to policy |

Safe pattern:

State change transaction -> audit outbox row -> async publisher -> SIEM/log sink

Outbox schema:

CREATE TABLE auth_audit_outbox (
    outbox_id UUID PRIMARY KEY,
    event_id UUID NOT NULL,
    occurred_at TIMESTAMPTZ NOT NULL,
    event_type TEXT NOT NULL,
    payload JSONB NOT NULL,
    published_at TIMESTAMPTZ,
    publish_attempts INTEGER NOT NULL DEFAULT 0,
    next_attempt_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
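The publish_attempts and next_attempt_at columns imply a retry schedule for the async publisher. A minimal sketch of one such schedule follows. OutboxRetryPolicy is a hypothetical name, and exponential backoff capped at five minutes is an assumed policy, not one the article specifies.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical retry policy for rows in the audit outbox. */
final class OutboxRetryPolicy {
    // Assumed cap so that a long SIEM outage does not push retries out indefinitely.
    private static final Duration MAX_DELAY = Duration.ofMinutes(5);

    private OutboxRetryPolicy() {}

    /**
     * Computes the next value for next_attempt_at after a failed publish.
     * Delay doubles with each attempt: 1s, 2s, 4s, ... up to MAX_DELAY.
     */
    static Instant nextAttemptAt(Instant now, int publishAttempts) {
        // Clamp the exponent so the shift cannot overflow.
        int exponent = Math.min(Math.max(publishAttempts, 0), 30);
        long delaySeconds = Math.min(1L << exponent, MAX_DELAY.getSeconds());
        return now.plusSeconds(delaySeconds);
    }
}
```

The publisher would select rows where published_at is null and next_attempt_at is in the past, then either set published_at or increment publish_attempts and store the new next_attempt_at.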

20. Privacy and data minimization

Auth observability easily turns into a surveillance system.

Principle:

Collect enough to secure and audit.
Do not collect because it is interesting.

Data minimization examples:

| Data | Alternative |
| --- | --- |
| full IP | prefix + hash |
| full email | account id + normalized domain if needed |
| raw user agent | parsed family + hash |
| precise geolocation | country/region risk category |
| raw device fingerprint | internal device id/hash |
| full token claim dump | allowlisted claim subset |

A token claim dump is very dangerous because it can contain email, groups, roles, tenant, entitlements, and external identifiers.

Use an allowlist:

private static final Set<String> LOGGABLE_CLAIMS = Set.of(
    "iss", "aud", "azp", "typ", "jti", "tenant_id", "auth_time", "acr", "amr"
);

Not:

log.info("jwt claims={}", jwt.getClaims());
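Applying the allowlist is a small filtering step before anything is logged. The sketch below is self-contained and uses a plain map in place of a JWT library type; LoggableClaims is a hypothetical class name, and the claim set simply mirrors the allowlist shown above.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical filter that keeps only allowlisted claims for telemetry. */
final class LoggableClaims {
    // Mirrors the allowlist above. If jti counts as sensitive in your policy,
    // hash it separately instead of passing it through here.
    private static final Set<String> LOGGABLE_CLAIMS = Set.of(
        "iss", "aud", "azp", "typ", "jti", "tenant_id", "auth_time", "acr", "amr");

    private LoggableClaims() {}

    /** Returns a sorted copy containing only allowlisted, non-null claims. */
    static Map<String, Object> filter(Map<String, ?> claims) {
        Map<String, Object> loggable = new TreeMap<>();
        claims.forEach((name, value) -> {
            if (LOGGABLE_CLAIMS.contains(name) && value != null) {
                loggable.put(name, value);
            }
        });
        return loggable;
    }
}
```

A call site would then log LoggableClaims.filter(jwt.getClaims()) rather than the full claim map, so a newly added claim stays out of logs until it is reviewed and allowlisted.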

21. Tracing authentication

Tracing helps answer questions about latency and dependency paths.

Example spans:

POST /login
  auth.resolve_tenant
  auth.rate_limit.check
  auth.account.lookup
  auth.password.verify
  auth.mfa.evaluate
  auth.session.create
  auth.audit.publish

But trace attributes must not contain secrets either.

Safe attributes:

auth.method=password
auth.outcome=failure
auth.reason_code=PASSWORD_MISMATCH
auth.tenant_tier=enterprise
auth.client_type=web
auth.assurance_level=aal2

Dangerous attributes:

auth.password=...
auth.token=...
auth.email=...
auth.session_id=...

With the OpenTelemetry Java API:

Span span = tracer.spanBuilder("auth.password.verify").startSpan();
try (Scope ignored = span.makeCurrent()) {
    span.setAttribute("auth.method", "password");
    span.setAttribute("auth.algorithm", passwordHash.algorithm());
    boolean matched = passwordVerifier.verify(rawPassword, passwordHash);
    span.setAttribute("auth.outcome", matched ? "success" : "failure");
    return matched;
} catch (RuntimeException ex) {
    span.recordException(ex);
    span.setStatus(StatusCode.ERROR);
    throw ex;
} finally {
    span.end();
}

22. Alerting: do not alert on everything

Alerts must be actionable.

Bad alert:

Login failed.

Good alert:

Refresh token reuse detected for token family with active sessions.
Action: revoke token family, mark account risk, require re-auth.

Alert candidates:

| Signal | Severity | Automated response |
| --- | --- | --- |
| refresh token reuse | critical | revoke family, require login |
| high credential stuffing | warn/critical | throttle IP/device/prefix |
| signing key validation failure spike | critical | freeze token issuance, inspect JWKS |
| admin MFA removed for many users | critical | suspend admin session |
| unusual tenant mismatch spike | warn | inspect routing/config |
| IdP callback state mismatch spike | warn/critical | check OAuth attack or cookie issue |
| password reset completion spike | warn | inspect campaign |

The alert payload must include:

what happened
affected tenant/client/account count
time window
reason codes
sample correlation ids
suggested runbook

23. Observability for incident response

When a token leak happens, you need fast queries:

-- Which sessions did this account create recently?
SELECT occurred_at, session_id_hash, source_ip_prefix, user_agent_hash, risk_score
FROM auth_audit_event
WHERE target_account_id = 'acct_123'
  AND event_type = 'auth.session.created'
  AND occurred_at > now() - interval '7 days'
ORDER BY occurred_at DESC;
-- Was refresh token reuse detected?
SELECT *
FROM auth_audit_event
WHERE token_family_id = 'rtf_abc'
  AND event_type = 'auth.refresh.reuse_detected';
-- Which accounts were targeted by same IP prefix?
SELECT target_account_id, count(*)
FROM auth_audit_event
WHERE source_ip_prefix = '203.0.113.0/24'
  AND event_type = 'auth.login.failed'
  AND occurred_at > now() - interval '1 hour'
GROUP BY target_account_id
ORDER BY count(*) DESC;
-- Which admin changed MFA factors?
SELECT occurred_at, actor_subject_id, target_account_id, reason_code, attributes
FROM auth_audit_event
WHERE event_type IN ('auth.mfa.factor.removed', 'auth.admin.factor_reset')
  AND occurred_at > now() - interval '30 days';

Good observability turns incident response into a process, not a panic.


24. Failure modes

| Failure mode | Root cause | Impact | Mitigation |
| --- | --- | --- | --- |
| Raw token logged | careless debug logging | credential leak | redaction, allowlist, tests |
| Password logged | request dump middleware | catastrophic | disable body logging for auth routes |
| No reason code | generic events only | poor triage | internal reason taxonomy |
| High-cardinality metrics | user/session label | metrics outage/cost | cardinality review |
| Event missing on failure | only success logged | attack invisible | log success and failure transitions |
| Audit sink down loses events | direct async fire-and-forget | evidence gap | outbox/buffer/fallback |
| MDC not cleared | thread reuse | privacy leak/wrong correlation | clear in finally |
| Token claim dump | log all claims | PII leak | claim allowlist |
| Too many alerts | low-quality rules | alert fatigue | actionable thresholds/runbooks |
| Tenant missing in event | non-tenant-aware logging | investigation impossible | tenant required for tenant-owned actions |
| Observability becomes auth dependency | synchronous SIEM call | login outage | async outbox except critical controls |

25. Review checklist

Use this checklist when reviewing authentication PRs:

[ ] Does every auth state transition emit a structured event?
[ ] Are outward messages generic but internal reason codes specific?
[ ] Are raw credentials/tokens/session ids never logged?
[ ] Are token claims allowlisted before logging?
[ ] Are event names stable and documented?
[ ] Are event IDs and timestamps generated server-side?
[ ] Are request_id, trace_id, tenant_id, client_id included where applicable?
[ ] Are metrics labels low-cardinality?
[ ] Are high-value audit events stored durably?
[ ] Does audit use outbox or reliable publication?
[ ] Are retention policies defined by event type?
[ ] Are events useful for incident response queries?
[ ] Are rate-limit, lockout, MFA, reset, token, and admin actions observable?
[ ] Are MDC and ThreadLocal values cleared after request?
[ ] Are alert rules actionable and tied to runbooks?

26. Production drills

Drill 1 — Credential stuffing

Given:

10,000 login failures in 10 minutes
8,000 distinct account identifiers
50 IP prefixes
few success events

Task:

Design detection query.
Choose automated throttle dimensions.
Decide whether to lock accounts.
Explain why account lockout may help attacker.

Expected direction:

Throttle source/IP/device/client dimensions first.
Avoid mass account lockout.
Raise risk score.
Require step-up for suspicious successful login.

Drill 2 — Refresh token reuse

Given:

refresh token family rtf_123 reused from new device
legitimate user still active

Task:

Which events are emitted?
Which sessions/tokens are revoked?
Which alert fires?
What user-facing action is required?

Expected direction:

auth.refresh.reuse_detected
auth.token.family_revoked
auth.session.revoked
force reauthentication and possibly step-up

Drill 3 — Audit sink unavailable

Given:

SIEM endpoint down for 1 hour
login traffic normal
password changes continue

Task:

Which events can be buffered?
Which events must be durable in DB?
What dashboard shows backlog?
When should auth degrade or fail closed?

27. Final mental model

Authentication observability is not about printing more text.

It is about preserving the evidence of security decisions.

A good auth system can say:

This subject attempted this authentication method,
against this tenant and client,
from this risk context,
with this outcome and reason,
causing this session/token/factor transition,
correlated to this request/trace,
recorded without leaking secrets.

That sentence is the bar.

If your system cannot reconstruct it, you do not have production-grade authentication yet.

