Runtime Contract Enforcement: Gateway, Service, Producer, Consumer, Registry, and Quarantine Controls
Learn Java API Contract Engineering, Event Contract Engineering & Schema Governance - Part 030
Runtime contract enforcement for Java APIs and event-driven systems: gateway validation, service validation, producer enforcement, consumer enforcement, schema registry, DLQ/quarantine, fail-open/fail-closed, and observability.
Learning Objectives
Static contract governance prevents many problems before release. Runtime can still drift, however:
- a service returns a response that differs from its OpenAPI definition;
- a producer publishes an invalid event;
- a consumer receives an old event that was never tested;
- the schema registry is unreachable;
- gateway configuration does not match the contract;
- Kafka topic retention differs from what was declared;
- a DLQ payload loses its original metadata;
- a producer bypasses the paved-road publisher;
- a manual hotfix changes behavior;
- a strict consumer parser crashes on an unknown field.
Runtime contract enforcement is the control layer that ensures a contract is correct not only in the repo but also in production.
After this part, you should be able to:
- determine enforcement points for API and event systems;
- distinguish validation at the edge, service, producer, broker/registry, consumer, and DLQ;
- design a fail-open/fail-closed strategy;
- apply runtime validation in Java/Spring/Kafka services;
- manage the performance cost of validation;
- send invalid messages to quarantine/DLQ safely;
- create metrics, logs, and traces for contract violations;
- perform shadow validation and sampling;
- detect runtime drift;
- avoid enforcement that causes a larger outage than the original bug.
1. Runtime Enforcement Layers
Enforcement should be layered, not duplicated blindly.
2. Enforcement Goals
Runtime enforcement aims to:
- prevent invalid input from entering system;
- prevent invalid output from leaving service;
- prevent invalid events from entering streams;
- protect consumers from bad events;
- preserve invalid artifacts for debugging;
- detect drift from contract;
- measure contract violation rate;
- provide evidence for incidents;
- support safe rollout/shadow validation;
- avoid silent data corruption.
It does not mean validating everything at every hop forever. Enforcement must be risk-based.
3. Enforcement Points
| Point | What to enforce |
|---|---|
| API gateway | auth, route, request shape, size, rate limit |
| API service inbound | DTO validation, business preconditions, idempotency |
| API service outbound | response schema, error contract, headers |
| Event producer | envelope, payload schema, topic, key, metadata |
| Outbox | event immutability, event ID stability, schemaRef |
| Schema registry/serializer | schema existence, compatibility, serialization |
| Broker config | topic policy, ACL, retention, compaction |
| Event consumer inbound | schema, event type, sequence, duplicate, classification |
| DLQ/quarantine | original preservation, failure metadata |
| Catalog/drift jobs | runtime vs declared contract |
4. Fail-Open vs Fail-Closed
Enforcement decision:
| Strategy | Meaning | Use when |
|---|---|---|
| fail-closed | reject/stop on violation | unsafe input, critical contract |
| fail-open | allow but record violation | observability rollout, low risk |
| shadow | validate but do not affect flow | new validation rule |
| sample | validate percentage | high volume response/event |
| quarantine | isolate invalid message | event consumer/projection |
| degrade | return limited response | non-critical enrichment |
| feature-gated | enforce for selected traffic | rollout |
4.1 Rule
Use fail-closed at trust boundaries for unsafe input.
Use shadow/sampling for outbound validation during adoption.
Use quarantine for event consumers when bad message should not corrupt state.
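To make the strategy table concrete, here is a minimal sketch of how a mode could be mapped to an outcome when a violation is detected. The names `EnforcementMode`, `ViolationOutcome`, and `ModeDecider` are hypothetical and exist only for this illustration; the sketch covers a subset of the strategies above (degrade and feature-gated are left out), and the random source is injected so the sampling decision can be tested.

```java
import java.util.function.DoubleSupplier;

// Hypothetical names for illustration; not part of any library.
enum EnforcementMode { FAIL_CLOSED, FAIL_OPEN, SHADOW, SAMPLE, QUARANTINE }

enum ViolationOutcome { REJECT, ALLOW_AND_RECORD, ISOLATE }

final class ModeDecider {
    private final double sampleRate;
    private final DoubleSupplier random; // injected so behavior is testable

    ModeDecider(double sampleRate, DoubleSupplier random) {
        this.sampleRate = sampleRate;
        this.random = random;
    }

    /** SAMPLE validates only a fraction of traffic; every other mode always validates. */
    boolean shouldValidate(EnforcementMode mode) {
        return mode != EnforcementMode.SAMPLE || random.getAsDouble() < sampleRate;
    }

    /** Maps a detected violation to what happens to the request or message. */
    ViolationOutcome onViolation(EnforcementMode mode) {
        return switch (mode) {
            case FAIL_CLOSED -> ViolationOutcome.REJECT;
            case QUARANTINE -> ViolationOutcome.ISOLATE;
            case FAIL_OPEN, SHADOW, SAMPLE -> ViolationOutcome.ALLOW_AND_RECORD;
        };
    }
}
```

Keeping the decision in one small type means a rollout can change the mode per boundary without touching the validators themselves.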
5. API Gateway Enforcement
Gateway can enforce:
- authentication;
- authorization/scopes;
- request size;
- content type;
- route/method existence;
- basic request schema;
- rate limits;
- idempotency key presence for selected operations;
- correlation/request ID;
- threat protection.
Gateway should not own deep business validation.
5.1 Gateway Benefits
- reject invalid requests before service load;
- consistent error response;
- central rate limit;
- enforce public contract;
- collect traffic metrics.
5.2 Gateway Risks
- gateway schema drifts from service;
- validation behavior differs from service;
- complex business rules pushed to gateway;
- response validation at gateway expensive;
- generated gateway config stale.
Contract source should generate or validate gateway config.
6. Java API Inbound Validation
Typical layers:
6.1 DTO Validation
Example:
public record ApproveCaseRequest(
    @NotBlank String reasonCode,
    @Size(max = 500) String note
) {}
Controller:
@PostMapping("/cases/{caseId}:approve")
public ResponseEntity<CaseResponse> approve(
    @PathVariable String caseId,
    @Valid @RequestBody ApproveCaseRequest request
) {
    CaseResponse response = caseApplicationService.approve(caseId, request);
    return ResponseEntity.ok(response);
}
6.2 Validation Boundary
DTO validation:
- shape;
- required fields;
- format;
- simple constraints.
Application/domain validation:
- case exists;
- user has authority;
- state transition allowed;
- policy active;
- idempotency semantics;
- cross-service reference.
Do not force all business validation into annotations.
7. Error Contract Enforcement
Use one error mapper for stable error shape.
Example Problem model:
public record ApiProblem(
    URI type,
    String title,
    int status,
    String detail,
    String instance,
    String code,
    String correlationId,
    List<Violation> violations
) {
    public record Violation(
        String field,
        String code,
        String message
    ) {}
}
Controller advice:
@RestControllerAdvice
public class ApiExceptionHandler {

    @ExceptionHandler(ValidationException.class)
    ResponseEntity<ApiProblem> handleValidation(ValidationException ex) {
        ApiProblem problem = problemFactory.validationProblem(ex);
        return ResponseEntity.status(422).body(problem);
    }

    @ExceptionHandler(CaseConflictException.class)
    ResponseEntity<ApiProblem> handleConflict(CaseConflictException ex) {
        ApiProblem problem = problemFactory.conflictProblem(ex);
        return ResponseEntity.status(409).body(problem);
    }
}
Enforce:
- stable code;
- correct HTTP status;
- no stack trace leakage;
- correlation ID present;
- retryability if part of contract;
- validation paths normalized.
8. Response Contract Enforcement
Response validation can catch server drift.
Options:
- validate all responses in tests;
- validate sampled production responses;
- validate shadow mode;
- validate only critical/public APIs;
- validate in gateway/proxy;
- validate in service filter.
8.1 Why Not Always Validate Everything?
Costs:
- CPU overhead;
- latency;
- large response validation;
- false positives due to schema mismatch;
- validation library bugs;
- outage risk if fail-closed.
8.2 Recommended
- fail-closed for inbound unsafe input;
- test-time response validation for all APIs;
- shadow/sampled response validation in production for critical/public APIs;
- fail-closed outbound only for highly controlled internal paths where risk justified.
8.3 Response Validation Metric
api.contract.response.validation.failures
labels:
api
operationId
status
violationCode
Avoid high cardinality labels like request ID.
9. API Idempotency Enforcement
For commands with side effects, enforce idempotency key.
@PostMapping("/payments")
public ResponseEntity<PaymentResponse> createPayment(
    @RequestHeader("Idempotency-Key") String idempotencyKey,
    @Valid @RequestBody CreatePaymentRequest request
) {
    return idempotencyService.execute(
        idempotencyKey,
        request.fingerprint(),
        () -> paymentService.create(request)
    );
}
Contract should define:
- header name;
- key scope;
- retention;
- same key same request behavior;
- same key different request conflict;
- response replay behavior.
Runtime must enforce it, not only document.
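The contract items above (same key with the same request replays, same key with a different request conflicts) can be sketched with an in-memory store. This is an illustration under assumptions, not the `idempotencyService` used in the controller: `InMemoryIdempotencyStore` and `IdempotencyKeyConflictException` are hypothetical names, and key scope, retention, and durable storage are deliberately left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical exception type for "same key, different request".
final class IdempotencyKeyConflictException extends RuntimeException {
    IdempotencyKeyConflictException(String key) {
        super("Idempotency key reused with a different request: " + key);
    }
}

// Assumption: a single-node, in-memory store; a real one would be durable and expire entries.
final class InMemoryIdempotencyStore<R> {
    private record Entry<V>(String fingerprint, V response) {}

    private final Map<String, Entry<R>> entries = new ConcurrentHashMap<>();

    R execute(String key, String fingerprint, Supplier<R> action) {
        // computeIfAbsent runs the action at most once per key, even under concurrent calls.
        Entry<R> entry = entries.computeIfAbsent(
                key, k -> new Entry<>(fingerprint, action.get()));
        if (!entry.fingerprint().equals(fingerprint)) {
            throw new IdempotencyKeyConflictException(key);
        }
        return entry.response(); // first call: fresh result; later calls: replayed result
    }
}
```

In this sketch a failed action stores nothing, so a retry with the same key runs the action again; whether failures should also be replayed is a contract decision.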
10. Event Producer Enforcement
Producer must validate before publishing.
Checks:
- event envelope present;
- eventId non-empty and stable;
- eventType matches schema/event descriptor;
- source matches service authority;
- occurredAt present;
- schemaRef present;
- aggregateId/key present;
- topic correct;
- key correct;
- schema validation passes;
- data classification present;
- no forbidden sensitive fields.
10.1 Java Event Publisher Wrapper
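public final class ContractEnforcedEventPublisher {
    private final SchemaValidator schemaValidator;
    private final EventPolicyValidator policyValidator;
    private final KafkaTemplate<String, Object> kafkaTemplate;

    public ContractEnforcedEventPublisher(
            SchemaValidator schemaValidator,
            EventPolicyValidator policyValidator,
            KafkaTemplate<String, Object> kafkaTemplate) {
        this.schemaValidator = schemaValidator;
        this.policyValidator = policyValidator;
        this.kafkaTemplate = kafkaTemplate;
    }

    public <T> void publish(EventDescriptor<T> descriptor, EventEnvelope<T> event) {
        // Validate policy and schema before anything reaches the broker.
        policyValidator.validate(descriptor, event);
        schemaValidator.validate(descriptor.schemaRef(), event);
        // Topic and key come from the descriptor, so callers cannot choose them freely.
        String key = descriptor.keyExtractor().apply(event);
        kafkaTemplate.send(descriptor.topic(), key, event);
    }
}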
Require teams to use paved-road publisher.
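As an illustration of what a policy validator behind such a wrapper might check, the sketch below compares envelope metadata against a descriptor and collects violations. `SketchDescriptor`, `SketchMetadata`, and `EnvelopePolicyCheck` are hypothetical, simplified stand-ins and not the `EventDescriptor` or `EventPolicyValidator` types used above; the mapping from check to violation code is also an assumption.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for the descriptor and envelope metadata.
record SketchDescriptor(String eventType, String source, String schemaRef) {}

record SketchMetadata(String eventId, String eventType, String source,
                      Instant occurredAt, String schemaRef, String dataClassification) {}

final class EnvelopePolicyCheck {
    /** Returns one entry per failed check; an empty list means the envelope passes. */
    static List<String> violations(SketchDescriptor descriptor, SketchMetadata metadata) {
        List<String> found = new ArrayList<>();
        if (isBlank(metadata.eventId())) found.add("EVENT_ENVELOPE_INVALID: eventId");
        if (metadata.occurredAt() == null) found.add("EVENT_ENVELOPE_INVALID: occurredAt");
        if (isBlank(metadata.dataClassification())) {
            found.add("EVENT_ENVELOPE_INVALID: dataClassification");
        }
        if (!descriptor.eventType().equals(metadata.eventType())) found.add("EVENT_TYPE_MISMATCH");
        if (!descriptor.source().equals(metadata.source())) found.add("EVENT_SOURCE_INVALID");
        if (!descriptor.schemaRef().equals(metadata.schemaRef())) found.add("EVENT_SCHEMA_UNKNOWN");
        return found;
    }

    private static boolean isBlank(String value) {
        return value == null || value.isBlank();
    }
}
```

Collecting every violation instead of stopping at the first one tends to make producer bugs easier to diagnose from a single log entry.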
10.2 Prevent Bypass
- static analysis ban direct KafkaTemplate use for domain events;
- code owners on publisher config;
- runtime metrics detect unknown event shape;
- broker ACL limits producer topics;
- CI tests verify published records.
11. Outbox Enforcement
Outbox row should include:
- eventId;
- eventType;
- topic;
- key;
- schemaRef;
- payload;
- metadata;
- occurredAt;
- publish status.
Validation before insert:
outboxValidator.validate(outboxEvent);
outboxRepository.save(outboxEvent);
Publisher should not mutate event identity.
Bad:
event.setEventId(randomId()) // during publish retry
Good:
eventId generated once at domain transaction/outbox creation.
Outbox enforcement protects against invalid events entering publish pipeline.
12. Schema Registry Runtime Enforcement
Producer serializer can enforce schema existence and serialization.
Production guidance:
- auto-register disabled or tightly controlled;
- producer uses schema known from CI;
- schema compatibility checked before release;
- serializer fails if schema missing;
- schema ID included in payload/header according to standard;
- schema cache monitored;
- registry failures handled with clear policy.
12.1 Registry Unavailable
If registry unavailable:
- producer may fail to publish if schema not cached;
- consumer may fail to deserialize unknown schema ID;
- cached schemas may allow continued operation temporarily.
Policy must define:
- cache TTL;
- startup behavior;
- fail-open/fail-closed;
- alerting;
- fallback if any.
For critical systems, registry availability is part of platform SLO.
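One possible shape for the cache and failure policy is a resolver that refreshes after a TTL but keeps serving a stale entry while the registry is down. This is a sketch under assumptions: `StaleTolerantSchemaCache` is a hypothetical class, the registry is modeled as a plain function from schema ID to schema text, and serving stale entries is only one of several valid policies.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

final class StaleTolerantSchemaCache {
    private record Cached(String schema, long loadedAtMillis) {}

    private final Map<Integer, Cached> cache = new ConcurrentHashMap<>();
    private final Function<Integer, String> registryLookup; // assumed to throw when the registry is down
    private final long ttlMillis;
    private final LongSupplier clock; // injected so expiry can be tested

    StaleTolerantSchemaCache(Function<Integer, String> registryLookup,
                             long ttlMillis, LongSupplier clock) {
        this.registryLookup = registryLookup;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    String resolve(int schemaId) {
        long now = clock.getAsLong();
        Cached hit = cache.get(schemaId);
        if (hit != null && now - hit.loadedAtMillis() < ttlMillis) {
            return hit.schema(); // fresh entry, no registry call
        }
        try {
            String fresh = registryLookup.apply(schemaId);
            cache.put(schemaId, new Cached(fresh, now));
            return fresh;
        } catch (RuntimeException registryFailure) {
            if (hit != null) {
                return hit.schema(); // stale but known: keep operating, alert elsewhere
            }
            // Never seen and registry unreachable: fail closed with a stable code.
            throw new IllegalStateException(
                    "SCHEMA_ID_RESOLUTION_FAILED: " + schemaId, registryFailure);
        }
    }
}
```

The design choice here is that a known schema survives an outage while an unknown one fails with a stable code, which matches the three cases listed above.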
13. Broker-Level Enforcement
Broker/Kafka controls:
- ACL: only approved producers can write topic;
- ACL: only approved consumers can read sensitive topics;
- topic config policy;
- max message size;
- retention/compaction config;
- quotas;
- schema-aware interceptors if platform supports;
- topic creation policy.
Broker cannot understand all semantics, but can enforce access and operational boundaries.
14. Event Consumer Enforcement
Consumer inbound checks:
- deserialization success;
- event envelope valid;
- event type supported or ignored;
- schemaRef supported;
- data classification allowed;
- duplicate detection;
- sequence/order check;
- tenant/jurisdiction check;
- replay marker handling;
- poison message route.
14.1 Consumer Skeleton
public void onMessage(ConsumerRecord<String, EventEnvelope<JsonNode>> record) {
    EventEnvelope<JsonNode> event = record.value();
    try {
        consumerPolicy.validate(record, event);

        if (!supportedEventTypes.contains(event.metadata().eventType())) {
            ignoredEventCounter.increment(event.metadata().eventType());
            return;
        }

        if (deduplicator.alreadyProcessed(event.metadata().eventId())) {
            duplicateCounter.increment(event.metadata().eventType());
            return;
        }

        EventEnvelope<JsonNode> normalized = upcaster.upcastToLatest(event);
        schemaValidator.validate(normalized.metadata().schemaRef(), normalized);
        handler.handle(normalized);
        deduplicator.markProcessed(event.metadata().eventId());
    } catch (RecoverableContractException ex) {
        retryStrategy.retry(record, ex);
    } catch (ContractViolationException ex) {
        quarantinePublisher.publish(record, ex);
    }
}
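The skeleton leaves out offset commits and transactions. In a real consumer, the boundary between handling the event, marking it processed, and committing the offset must be designed carefully, otherwise a crash between those steps can lose or double-process an event.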
15. Quarantine vs DLQ
15.1 DLQ
Usually for failed processing after retries.
15.2 Quarantine
For messages requiring manual/controlled review, often because processing them could corrupt state.
Use quarantine for:
- sequence gap;
- unknown schema for critical projection;
- data classification violation;
- semantic validation failure;
- possible producer bug;
- invalid event from authoritative source.
DLQ/quarantine must preserve original message.
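The distinction can be sketched as a small routing function: failures that signal a possible contract problem go to quarantine immediately, everything else is retried and lands in the DLQ once attempts run out. `FailureRoute` and `FailureRouter` are hypothetical names, and which codes count as quarantine-worthy is an assumption chosen for this illustration.

```java
import java.util.Set;

enum FailureRoute { RETRY, DLQ, QUARANTINE }

final class FailureRouter {
    // Assumption: these codes indicate that processing could corrupt state.
    private static final Set<String> QUARANTINE_CODES = Set.of(
            "EVENT_SEQUENCE_GAP",
            "EVENT_SCHEMA_UNKNOWN",
            "EVENT_SCHEMA_INVALID",
            "EVENT_CLASSIFICATION_VIOLATION",
            "EVENT_SOURCE_INVALID");

    private final int maxAttempts;

    FailureRouter(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    /**
     * @param violationCode contract violation code, or null for a plain processing failure
     * @param attempt       number of attempts made so far, starting at 1
     */
    FailureRoute route(String violationCode, int attempt) {
        if (violationCode != null && QUARANTINE_CODES.contains(violationCode)) {
            return FailureRoute.QUARANTINE; // retrying cannot fix a contract violation
        }
        return attempt < maxAttempts ? FailureRoute.RETRY : FailureRoute.DLQ;
    }
}
```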
16. DLQ Contract Enforcement
DLQ message should include:
- original topic;
- original partition;
- original offset;
- original key;
- original headers;
- original value/envelope;
- consumer group;
- failure code;
- failure message sanitized;
- failure time;
- attempt count;
- correlationId/eventId.
Java helper:
public final class DlqPublisher {
    public void publish(ConsumerRecord<String, ?> original, Exception exception) {
        DlqEnvelope dlq = DlqEnvelope.from(original, exception);
        dlqValidator.validate(dlq);
        kafkaTemplate.send(dlqTopic(original.topic()), original.key(), dlq);
    }
}
Do not drop original envelope.
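A rough sketch of an envelope that carries the fields listed above is shown here. It is not the `DlqEnvelope` used by the helper: `OriginalRecord` and `SketchDlqEnvelope` are hypothetical types that avoid a Kafka dependency, and the `sanitize` step only flattens and truncates the message, which is an assumption and not a complete redaction strategy.

```java
import java.time.Instant;
import java.util.Map;

// Hypothetical stand-in for the consumed record, kept free of Kafka types.
record OriginalRecord(String topic, int partition, long offset, String key,
                      Map<String, String> headers, byte[] value) {}

record SketchDlqEnvelope(OriginalRecord original, String consumerGroup, String failureCode,
                         String failureMessage, Instant failedAt, int attemptCount) {

    static SketchDlqEnvelope from(OriginalRecord original, String consumerGroup,
                                  String failureCode, Exception cause,
                                  int attemptCount, Instant now) {
        if (original == null) {
            throw new IllegalArgumentException("DLQ_ORIGINAL_CONTEXT_MISSING");
        }
        // The original record is embedded untouched: topic, partition, offset, key, headers, value.
        return new SketchDlqEnvelope(original, consumerGroup, failureCode,
                sanitize(cause.getMessage()), now, attemptCount);
    }

    /** Collapses whitespace and caps length; real sanitizing must also strip sensitive data. */
    static String sanitize(String message) {
        if (message == null) {
            return "";
        }
        String oneLine = message.replaceAll("\\s+", " ").trim();
        return oneLine.length() <= 200 ? oneLine : oneLine.substring(0, 200);
    }
}
```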
17. Runtime Schema Validation Strategies
17.1 Producer Full Validation
Pros:
- prevents invalid event entering stream;
- early failure;
- easier debugging.
Cons:
- overhead;
- duplicate with serializer;
- may block on validation bug.
Recommended for critical events.
17.2 Consumer Full Validation
Pros:
- protects consumer;
- catches bad producer/drift;
- good for external/untrusted streams.
Cons:
- overhead;
- old event validation complexity;
- might poison on schema evolution if too strict.
17.3 Sampling
Validate 1% or selected event types.
Useful for high-volume trusted internal streams.
17.4 Shadow Validation
Run validation and emit metrics but do not block.
Use for rollout of new rules.
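Sampling can be made deterministic by hashing the event ID, so the same event is either always or never validated, including on retries and replays. The sketch below assumes a hypothetical `EventSampler` class and relies on `String.hashCode()`, which is stable across JVMs but does not guarantee a perfectly uniform spread.

```java
final class EventSampler {
    private final int thresholdPerTenThousand;

    EventSampler(double sampleRate) {
        if (sampleRate < 0.0 || sampleRate > 1.0) {
            throw new IllegalArgumentException("sampleRate must be between 0 and 1");
        }
        this.thresholdPerTenThousand = (int) Math.round(sampleRate * 10_000);
    }

    /** Same eventId always yields the same decision, so retries behave consistently. */
    boolean shouldValidate(String eventId) {
        // floorMod keeps the bucket in 0..9999 even for negative hash codes.
        return Math.floorMod(eventId.hashCode(), 10_000) < thresholdPerTenThousand;
    }
}
```

A shadow-mode caller would run the validator when `shouldValidate` returns true, record the result as a metric, and continue processing regardless of the outcome.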
18. Contract Violation Taxonomy
Define stable violation codes.
API:
API_REQUEST_SCHEMA_INVALID
API_RESPONSE_SCHEMA_INVALID
API_ERROR_CONTRACT_INVALID
API_IDEMPOTENCY_KEY_MISSING
API_UNDOCUMENTED_STATUS
Event:
EVENT_SCHEMA_INVALID
EVENT_ENVELOPE_INVALID
EVENT_TYPE_UNSUPPORTED
EVENT_KEY_MISMATCH
EVENT_SEQUENCE_GAP
EVENT_DUPLICATE
EVENT_SOURCE_INVALID
EVENT_CLASSIFICATION_VIOLATION
EVENT_SCHEMA_UNKNOWN
EVENT_REPLAY_UNSAFE
Registry:
SCHEMA_NOT_FOUND
SCHEMA_COMPATIBILITY_REJECTED
SCHEMA_ID_RESOLUTION_FAILED
DLQ:
DLQ_ORIGINAL_CONTEXT_MISSING
DLQ_PUBLISH_FAILED
Stable codes enable metrics, alerting, and incident analysis.
19. Observability for Runtime Enforcement
Metrics:
- validation success/failure;
- contract violation count;
- violation by code;
- event type invalid count;
- DLQ/quarantine count;
- schema ID unknown;
- registry lookup latency/failure;
- response validation failure;
- idempotency conflict;
- replay skipped side effects.
Labels:
service
artifactId
operationId
eventType
topic
consumerGroup
violationCode
environment
Avoid high cardinality:
eventId
customerId
caseId
correlationId
Logs/traces should include high-cardinality IDs, metrics should not.
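One way to keep high-cardinality identifiers out of metrics is to allow only a fixed set of label names at the point where counters are incremented. The following is a minimal in-memory sketch; `ViolationCounter` is a hypothetical class and not the API of a metrics library, and the allow-list simply mirrors the label names listed above.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

final class ViolationCounter {
    private static final Set<String> ALLOWED_LABELS = Set.of(
            "service", "artifactId", "operationId", "eventType",
            "topic", "consumerGroup", "violationCode", "environment");

    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

    void increment(Map<String, String> labels) {
        for (String name : labels.keySet()) {
            if (!ALLOWED_LABELS.contains(name)) {
                // eventId, customerId and similar belong in logs and traces, not metrics.
                throw new IllegalArgumentException("Label not allowed on metrics: " + name);
            }
        }
        counts.computeIfAbsent(seriesKey(labels), k -> new LongAdder()).increment();
    }

    long count(Map<String, String> labels) {
        LongAdder adder = counts.get(seriesKey(labels));
        return adder == null ? 0L : adder.sum();
    }

    // Sorting the labels makes the series key independent of map iteration order.
    private static String seriesKey(Map<String, String> labels) {
        return new TreeMap<>(labels).toString();
    }
}
```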
20. Trace and Correlation Enforcement
At runtime ensure:
- inbound request has or gets request/correlation ID;
- correlation propagates to events;
- causation ID set from command/event;
- trace context flows across async boundary;
- logs include eventId/correlationId;
- DLQ preserves correlation.
Example event creation:
EventMetadata metadata = EventMetadata.builder()
    .eventId(eventIds.next())
    .correlationId(correlationContext.currentCorrelationId())
    .causationId(correlationContext.currentCausationId())
    .traceId(traceContext.currentTraceId())
    .build();
Without runtime enforcement, observability fields are often missing.
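The propagation rules can be sketched in a few lines: the correlation ID is carried unchanged through the whole chain, while the causation ID of each new event points at the event or command that triggered it. `TraceMetadata` is a hypothetical record and not the `EventMetadata` builder shown above; generating IDs with `UUID.randomUUID()` is an assumption.

```java
import java.util.UUID;

record TraceMetadata(String eventId, String correlationId, String causationId) {

    /** Starts a chain; a missing inbound correlation ID is generated instead of left empty. */
    static TraceMetadata root(String inboundCorrelationId) {
        String correlationId = (inboundCorrelationId == null || inboundCorrelationId.isBlank())
                ? UUID.randomUUID().toString()
                : inboundCorrelationId;
        return new TraceMetadata(UUID.randomUUID().toString(), correlationId, null);
    }

    /** Metadata for an event caused by this one: same correlation, causation = this eventId. */
    TraceMetadata causedEvent() {
        return new TraceMetadata(UUID.randomUUID().toString(), correlationId, eventId);
    }
}
```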
21. Data Classification Enforcement
Runtime checks:
- producer includes classification metadata;
- topic classification compatible with event classification;
- consumer authorized for classification;
- DLQ classification not weaker than source;
- data lake sink allowed;
- sensitive fields not in logs;
- examples/test data do not leak secrets.
Example policy:
if (!topicPolicy.allows(event.metadata().dataClassification())) {
    throw new ContractViolationException("EVENT_CLASSIFICATION_VIOLATION");
}
For high-risk domains, data classification enforcement prevents accidental leakage.
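A topic policy like the one above can be backed by an ordered classification scale. In this sketch the four levels and their order are assumptions, and `DataClassification` and `ClassificationPolicy` are hypothetical names; the point is only that "not weaker than source" becomes a simple comparison.

```java
// Assumed scale, ordered from least to most sensitive.
enum DataClassification {
    PUBLIC, INTERNAL, CONFIDENTIAL, RESTRICTED;

    boolean atLeast(DataClassification other) {
        return compareTo(other) >= 0; // enum order defines sensitivity
    }
}

final class ClassificationPolicy {
    /** A topic may carry an event only if the topic is cleared for that sensitivity. */
    static boolean topicAllows(DataClassification topicLevel, DataClassification eventLevel) {
        return topicLevel.atLeast(eventLevel);
    }

    /** A DLQ must be protected at least as strongly as the topic it takes messages from. */
    static boolean dlqAllowed(DataClassification dlqLevel, DataClassification sourceLevel) {
        return dlqLevel.atLeast(sourceLevel);
    }
}
```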
22. Runtime Drift Detection
Drift examples:
- gateway route exists but OpenAPI missing;
- service returns field not in OpenAPI;
- producer emits eventType not in AsyncAPI;
- Kafka topic retention differs from contract;
- producer uses schema not in repo;
- consumer group exists but not registered;
- deprecated event still produced after retirement;
- API auth scope differs from spec.
Drift job:
Drift should be categorized by severity.
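At its core, a drift check is a comparison between what the contract declares and what is observed at runtime. The sketch below does this for event types only; `DriftFinding`, `DriftSeverity`, and `EventTypeDriftDetector` are hypothetical names, and the severity assigned to each kind of drift is an assumption that a real policy would define.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

enum DriftSeverity { HIGH, LOW }

record DriftFinding(String eventType, String kind, DriftSeverity severity) {}

final class EventTypeDriftDetector {
    /** Compares event types declared in the contract with those observed at runtime. */
    static List<DriftFinding> detect(Set<String> declared, Set<String> observed) {
        List<DriftFinding> findings = new ArrayList<>();
        // TreeSet gives a stable, sorted report order.
        for (String type : new TreeSet<>(observed)) {
            if (!declared.contains(type)) {
                // Assumption: undeclared runtime events are the more severe case.
                findings.add(new DriftFinding(type, "OBSERVED_NOT_DECLARED", DriftSeverity.HIGH));
            }
        }
        for (String type : new TreeSet<>(declared)) {
            if (!observed.contains(type)) {
                findings.add(new DriftFinding(type, "DECLARED_NOT_OBSERVED", DriftSeverity.LOW));
            }
        }
        return findings;
    }
}
```

The same set comparison would apply to gateway routes versus OpenAPI paths or to registered versus actual consumer groups.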
23. Runtime Enforcement Rollout
Do not turn on strict enforcement everywhere at once.
Rollout:
- observe only;
- shadow validation;
- sample validation;
- warn-only;
- fail-closed for low-risk endpoints/events;
- expand coverage;
- enforce for critical boundaries;
- monitor violation rate;
- tune false positives.
Example feature flag:
contractValidation:
  mode: shadow
  sampleRate: 0.10
  failClosed:
    operations:
      - createPayment
    eventTypes:
      - PaymentCaptured
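One plausible reading of this flag is that the top-level mode is the default and the `failClosed` lists override it for specific operations and event types. The sketch below encodes that interpretation; it is an assumption about the semantics, and `ValidationConfig` is a hypothetical record that would normally be bound from configuration.

```java
import java.util.Set;

// Assumption: failClosed entries override the default mode for the named items.
record ValidationConfig(String defaultMode, double sampleRate,
                        Set<String> failClosedOperations, Set<String> failClosedEventTypes) {

    String modeForOperation(String operationId) {
        return failClosedOperations.contains(operationId) ? "fail-closed" : defaultMode;
    }

    String modeForEventType(String eventType) {
        return failClosedEventTypes.contains(eventType) ? "fail-closed" : defaultMode;
    }
}
```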
24. Performance Considerations
Validation cost depends on:
- payload size;
- schema complexity;
- format;
- validator implementation;
- reference resolution;
- caching;
- serialization/deserialization;
- sample rate.
Optimization:
- compile/cache schemas;
- avoid network schema fetch per message;
- validate at producer once if trusted;
- sample response validation;
- use binary schema serializers;
- keep schemas sane;
- benchmark critical paths;
- make enforcement configurable.
Do not let validation fetch remote refs synchronously per request.
25. Java Implementation Patterns
25.1 Spring Boot API Validation
- Jackson parse strictness;
- Bean Validation annotations;
- ControllerAdvice;
- request/response filters;
- OpenAPI generated interfaces;
- problem detail mapper;
- idempotency service;
- correlation filter.
25.2 Kafka Producer Enforcement
- event publisher wrapper;
- schema validator;
- topic/key descriptor;
- producer interceptor for headers;
- outbox validator;
- schema registry serializer.
25.3 Kafka Consumer Enforcement
- deserializer with schema registry;
- envelope validator;
- idempotency repository;
- sequence checker;
- upcaster;
- quarantine publisher;
- retry/DLQ strategy;
- metrics.
25.4 Shared Platform Libraries
Put enforcement in libraries, not copy-pasted service code.
26. Testing Runtime Enforcement
Tests:
- invalid request rejected;
- invalid response detected in test/shadow;
- missing idempotency key rejected;
- producer blocks invalid event;
- producer uses correct topic/key;
- consumer deduplicates duplicate event;
- consumer quarantines schema-invalid event;
- DLQ preserves original context;
- registry missing schema fails startup;
- data classification violation blocked;
- drift detector flags mismatch.
Example:
@Test
void producerRejectsEventWithWrongEventType() {
    EventEnvelope<CaseApprovedPayload> event =
        caseApprovedEventWithType("CaseClosed");

    assertThatThrownBy(() -> publisher.publish(CASE_APPROVED_DESCRIPTOR, event))
        .isInstanceOf(ContractViolationException.class)
        .hasMessageContaining("EVENT_TYPE_MISMATCH");
}
27. Failures Caused by Enforcement
Enforcement can cause incidents if poorly designed.
Examples:
- validator bug rejects valid production traffic;
- registry outage blocks producer;
- response validation fail-closed causes outage;
- consumer quarantines every event due to new optional field;
- DLQ topic unavailable blocks consumer;
- strict JSON parser rejects harmless unknown field;
- schema cache eviction causes latency spike.
Mitigations:
- shadow mode rollout;
- feature flags;
- fail-open for non-critical outbound checks;
- fallback cache;
- alert before block;
- canary enforcement;
- clear rollback.
Enforcement must be reliable software.
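Several of these mitigations can be combined in a thin wrapper around any validator: a feature flag that can switch the check off, and a guard that reports a crashing validator as its own result instead of as a contract violation. `GuardedValidator` is a hypothetical class, and modeling the validator as a function that returns an optional violation code is an assumption made for this sketch.

```java
import java.util.Optional;
import java.util.function.BooleanSupplier;
import java.util.function.Function;

final class GuardedValidator<T> {
    enum Result { VALID, VIOLATION, SKIPPED, VALIDATOR_ERROR }

    private final Function<T, Optional<String>> validator; // returns a violation code, if any
    private final BooleanSupplier enabled;                 // feature flag / kill switch

    GuardedValidator(Function<T, Optional<String>> validator, BooleanSupplier enabled) {
        this.validator = validator;
        this.enabled = enabled;
    }

    Result check(T payload) {
        if (!enabled.getAsBoolean()) {
            return Result.SKIPPED; // flag off: the validator is never invoked
        }
        try {
            return validator.apply(payload).isPresent() ? Result.VIOLATION : Result.VALID;
        } catch (RuntimeException validatorBug) {
            // A crashing validator is an enforcement failure, not evidence of bad traffic.
            return Result.VALIDATOR_ERROR;
        }
    }
}
```

For non-critical outbound checks a caller would typically treat `VALIDATOR_ERROR` like fail-open and alert on it, while a critical inbound boundary might still choose to reject.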
28. Runtime Enforcement Policy Template
runtimeContractEnforcement:
  api:
    inbound:
      mode: fail-closed
      validation: dto-and-schema
    outbound:
      mode: shadow
      sampleRate: 0.05
    errors:
      problemDetailsRequired: true
    idempotency:
      enforceFor:
        - createPayment
        - submitCase
  events:
    producer:
      mode: fail-closed
      requireSchemaValidation: true
      requireEnvelopeValidation: true
      requireTopicKeyValidation: true
    consumer:
      schemaValidation: fail-closed-for-critical
      unknownEventType: ignore-on-multitype-topic
      duplicateHandling: required
      sequenceGap: quarantine
    dlq:
      preserveOriginalEnvelope: true
      includeFailureMetadata: true
  registry:
    autoRegisterInProd: false
    startupSchemaCheck: true
  observability:
    metricsRequired: true
    violationCodesRequired: true
  rollout:
    defaultModeForNewRules: shadow
29. Runtime Enforcement Anti-Patterns
29.1 Static Governance Only
Production drifts.
29.2 Validate Everything Everywhere
Latency/outage risk.
29.3 Fail-Closed Outbound Response Too Early
Turns schema mismatch into outage.
29.4 Runtime Auto-Register Schemas in Prod
Bypasses review.
29.5 Consumer Rejects Unknown Event Type
Multi-type topic evolution breaks.
29.6 DLQ Drops Original Event
Debugging impossible.
29.7 Metrics with Event ID Label
High-cardinality metrics explosion.
29.8 No Feature Flag
Cannot disable bad validator.
29.9 No Shadow Mode
Strict enforcement causes surprise.
29.10 Enforcement Logic Copy-Pasted
Inconsistent behavior across services.
29.11 Registry Required on Every Message
Latency and outage risk.
29.12 Contract Violations Logged but Not Alerted
No operational value.
30. Practice Lab
Lab 1 — API Enforcement
Design runtime enforcement for POST /payments:
- request validation;
- idempotency key;
- error contract;
- response validation;
- metrics.
Lab 2 — Producer Enforcement
Design Java event publisher wrapper for PaymentCaptured.
Validate envelope, schema, topic, key, classification.
Lab 3 — Consumer Enforcement
Consumer ledger-projection consumes PaymentCaptured.
Design duplicate, sequence, schema, DLQ, and replay handling.
Lab 4 — Fail-Open/Fail-Closed
Classify enforcement mode for:
- invalid public API request;
- API response validation failure;
- invalid event produced by payment service;
- unknown event type on multi-type topic;
- registry unavailable;
- DLQ publish failure.
Lab 5 — Drift Detection
Runtime emits eventType not in AsyncAPI. Design drift detector and escalation.
Lab 6 — Observability
Define metrics and alert thresholds for contract violations.
31. Senior Engineer Heuristics
- Static contract checks prevent bad releases; runtime enforcement detects bad reality.
- Validate at trust boundaries.
- Producer validation prevents bad events from becoming shared history.
- Consumer validation protects local state.
- Fail-closed inbound; shadow/sample outbound unless risk justifies.
- Kafka key/topic validation is runtime contract enforcement.
- Schema registry is runtime dependency; design its failure mode.
- DLQ must preserve original context.
- Contract violations need stable codes and metrics.
- Avoid high-cardinality metric labels.
- Use feature flags for enforcement rollout.
- Runtime drift detection closes the governance loop.
- Enforcement should be implemented in paved-road libraries.
- Do not let enforcement create bigger outages than the violation.
- A serious platform treats contract conformance as observable production behavior.
32. Summary
Runtime contract enforcement ensures that APIs, events, schemas, and operational metadata conform to contract in production, not just in repositories. It includes gateway validation, service validation, producer enforcement, outbox checks, schema registry runtime behavior, consumer validation, DLQ/quarantine, drift detection, and observability.
Main takeaways:
- enforce at multiple layers, but risk-base the strictness;
- fail-closed for unsafe inbound boundaries;
- use shadow/sampling for outbound validation rollout;
- producer event validation protects shared event history;
- consumer validation protects projections and side effects;
- schema registry runtime behavior needs caching and failure policy;
- DLQ/quarantine must preserve original event context;
- contract violation taxonomy enables metrics and alerts;
- runtime drift detection finds divergence between declared and actual systems;
- enforcement must be reliable, feature-gated, observable, and implemented through paved-road libraries.
The next part covers contract observability and incident response: metrics, traces, logs, violation dashboards, consumer lag, DLQ analysis, schema incidents, compatibility incidents, and postmortem-driven governance improvement.