Client Configuration Model

Learn Java Microservices Communication - Part 025

Production-grade configuration model for Java microservice HTTP clients, covering timeout, retry, circuit breaker, bulkhead, pool, rate limit, TLS, payload, and per-operation policies.

2026-07-05 · 18 min read · 3447 words

Part 025 — Client Configuration Model: Timeouts, Pool, Retry, Circuit Breaker

A client without explicit configuration is not a production client.

It is a bet.

The bet usually sounds like this:

The library defaults are probably fine.

That bet fails because HTTP client defaults are designed to be broadly usable, not specifically safe for your latency budget, downstream capacity, retry semantics, audit requirements, or incident profile.

A production-grade Java microservice client needs a configuration model, not scattered properties.

The goal is not to add every knob.

The goal is to make communication behavior deliberate:

How long can this operation wait?
How many concurrent calls can it consume?
Is retry safe?
Is the result allowed to be stale?
How is overload handled?
Which failures open the circuit?
Which status codes are domain outcomes?
Which metrics and traces will identify the dependency?
Which changes can be made safely at runtime?

This part gives you a practical model.

1. The Core Rule

Treat every outbound dependency as a resource with a policy.

Not as a URL.

Not as a generated client.

Not as a Spring bean.

Not as an interface.

A dependency is a resource that consumes:

caller request time
caller threads
caller heap
caller sockets
downstream capacity
network bandwidth
retry budget
observability cardinality
operational attention

So the configuration unit should be:

client dependency + operation + communication policy

Example:

clients:
  customer-service:
    base-url: http://customer-service.default.svc.cluster.local
    default-policy: internal-read
    operations:
      getCustomer:
        method: GET
        path-template: /customers/{customerId}
        policy: critical-read
      suspendCustomer:
        method: POST
        path-template: /customers/{customerId}:suspend
        policy: idempotent-command

The policy is the key.

The URL is just an address.

2. Configuration Is Not One Layer

A common mistake is to configure the client at one level only:

connect-timeout: 500ms
read-timeout: 2s
retry: 3

That looks simple, but it hides four different scopes.

A strong model separates them:

Scope	Purpose	Example
Platform default	Maximum guardrail	No HTTP request may run longer than 10s
Client default	Dependency baseline	Customer service default timeout 800ms
Operation policy	Semantic behavior	`getCustomer` may retry, `createPayment` may not unless idempotent
Request override	Contextual constraint	Parent deadline has only 120ms remaining

The deeper the scope, the more specific it becomes.

But the deeper scope must never violate the safety ceiling.

Bad:

platform:
  max-total-timeout: 5s
clients:
  report-service:
    operations:
      generateReport:
        timeout: 60s

If one operation needs 60 seconds, it is probably not a normal service-to-service HTTP operation. It may need async job submission, polling, callback, or streaming.
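The four scopes above can be resolved into one effective timeout per call. The following is a minimal sketch under stated assumptions: the class name LayeredTimeouts and its parameters are hypothetical, and it clamps silently at runtime, whereas a startup validator would normally reject an operation timeout that exceeds the platform ceiling.

```java
import java.time.Duration;
import java.util.Optional;

// Hypothetical helper: names and the clamping rule are illustrative assumptions.
final class LayeredTimeouts {
    private LayeredTimeouts() {}

    /**
     * Picks the most specific configured timeout (operation over client default),
     * then clamps it to the platform ceiling and to the remaining parent budget.
     */
    static Duration resolve(Duration platformCeiling,
                            Duration clientDefault,
                            Optional<Duration> operationTimeout,
                            Optional<Duration> remainingParentBudget) {
        Duration chosen = operationTimeout.orElse(clientDefault);
        Duration clamped = min(chosen, platformCeiling);
        return remainingParentBudget.map(r -> min(clamped, r)).orElse(clamped);
    }

    // Duration has no built-in min(), so compare explicitly.
    static Duration min(Duration a, Duration b) {
        return a.compareTo(b) <= 0 ? a : b;
    }
}
```

With a 5s ceiling, the 60s generateReport timeout from the bad example resolves to 5s, which is a hint that the operation needs a different communication style rather than a bigger number.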

3. The Minimum Client Policy

A production outbound client policy should at least define:

policy:
  timeout:
    connect: 150ms
    request: 700ms
    pool-acquire: 50ms
  retry:
    enabled: true
    max-attempts: 2
    backoff:
      initial: 50ms
      max: 150ms
      jitter: true
    retry-on:
      status: [502, 503, 504]
      exceptions: [connect-timeout, connection-reset]
  circuit-breaker:
    enabled: true
    failure-rate-threshold: 50
    slow-call-rate-threshold: 50
    slow-call-duration-threshold: 500ms
    minimum-calls: 50
  bulkhead:
    max-concurrent-calls: 40
    max-wait: 0ms
  rate-limit:
    enabled: false
  payload:
    max-request-bytes: 262144
    max-response-bytes: 1048576
    compression: response-only
  observability:
    dependency: customer-service
    operation: getCustomer
    slo: 300ms

This is not every possible setting.

It is the minimum operational contract.

4. Timeout Configuration

Timeouts are the most important client configuration.

They define how much failure the caller is willing to wait through.

4.1 Timeout Types

Do not use one number for every timeout.

Timeout	Meaning	Failure protected
Connect timeout	Time to establish connection	Bad route, blocked SYN, unreachable host
Pool acquire timeout	Time waiting for a reusable connection/thread/permit	Local saturation
Request timeout	Total time waiting for operation result	Slow downstream or network
Read/write timeout	Idle socket phase timeout where supported	Stalled transfer
Deadline	Absolute remaining parent budget	Cascading latency

In the JDK HttpClient, HttpClient.Builder.connectTimeout(Duration) sets the connect timeout for the whole client, and HttpRequest.Builder.timeout(Duration) sets a timeout on an individual request. Other clients such as Apache HttpClient, OkHttp, Reactor Netty, and the request factories underlying Spring's clients may expose additional pool/read/write controls.

The exact knobs differ.

The policy model should not.
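As a concrete illustration of where the two JDK knobs live, here is a small sketch. The class and method names are hypothetical, the base URI is the example address used earlier, and the customer id is assumed to be a simple value that needs no URL encoding. Nothing is sent over the network; the code only builds the client and the request.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical example class; only the java.net.http calls are real JDK API.
final class JdkTimeoutExample {
    private JdkTimeoutExample() {}

    // The connect timeout is a client-level setting shared by all requests.
    static HttpClient newClient(Duration connectTimeout) {
        return HttpClient.newBuilder()
                .connectTimeout(connectTimeout)
                .build();
    }

    // The request timeout is set per request, so it can differ per operation.
    static HttpRequest getCustomerRequest(URI baseUri, String customerId, Duration requestTimeout) {
        return HttpRequest.newBuilder(baseUri.resolve("/customers/" + customerId))
                .timeout(requestTimeout)
                .GET()
                .build();
    }
}
```

Because the request timeout is attached to each HttpRequest, it is the natural place to apply a per-call effective timeout derived from the operation policy and the parent deadline.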

4.2 Deadline Beats Static Timeout

Static timeout says:

This operation may take 700ms.

Deadline says:

This request has 240ms remaining.

When both exist, use the smaller budget.

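Duration configuredTimeout = Duration.ofMillis(700);
Duration safetyMargin = Duration.ofMillis(20);
Duration usable = deadline.remaining().minus(safetyMargin);
// Duration has no built-in min(), so compare explicitly.
Duration effectiveTimeout = usable.compareTo(configuredTimeout) < 0 ? usable : configuredTimeout;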

Why subtract a margin?

Because the caller still needs time to map errors, release resources, write logs, finish response handling, and avoid violating its own upstream deadline.

A simple rule:

effective timeout = min(operation timeout, remaining parent deadline - safety margin)

4.3 Timeout Budget Example

Suppose an API has a 1-second server-side SLO.

If every downstream call independently uses a 1-second timeout, two slow sequential calls already exceed the budget, and the top-level SLO is fake.

Timeouts must be composed.
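One way to check composition is to add up the worst case of a sequential call chain and compare it with the SLO. The sketch below is an assumption-laden illustration: the CallBudget record is hypothetical, calls are assumed to run strictly one after another, and each attempt is assumed to get the full request timeout plus the maximum backoff between attempts.

```java
import java.time.Duration;
import java.util.List;

// Hypothetical budget check for a chain of sequential downstream calls.
final class TimeoutBudget {
    private TimeoutBudget() {}

    record CallBudget(String operation, Duration requestTimeout, int maxAttempts, Duration maxBackoff) {
        // Worst case: every attempt times out and every wait uses the maximum backoff.
        Duration worstCase() {
            Duration attempts = requestTimeout.multipliedBy(maxAttempts);
            Duration waits = maxBackoff.multipliedBy(Math.max(0, maxAttempts - 1));
            return attempts.plus(waits);
        }
    }

    static Duration worstCaseSequential(List<CallBudget> calls) {
        return calls.stream()
                .map(CallBudget::worstCase)
                .reduce(Duration.ZERO, Duration::plus);
    }

    static boolean fitsWithin(Duration slo, List<CallBudget> calls) {
        return worstCaseSequential(calls).compareTo(slo) <= 0;
    }
}
```

For example, a read with a 300ms timeout, two attempts, and 50ms backoff (650ms worst case) followed by a single 250ms call totals 900ms and fits a 1-second SLO, while three calls with 1-second timeouts total 3 seconds and do not.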

4.4 Bad Timeout Values

These are suspicious:

Value	Problem
Infinite	Caller can hang forever
30s default everywhere	Cascading failure amplifier
Same timeout for all dependencies	Ignores dependency semantics
Timeout greater than upstream gateway timeout	Wasted work after caller gave up
Retry timeout not included in total budget	Retry storm risk

4.5 Timeout Implementation Shape

public record TimeoutPolicy(
        Duration connectTimeout,
        Duration poolAcquireTimeout,
        Duration requestTimeout,
        Duration safetyMargin
) {
    public Duration effectiveRequestTimeout(Deadline deadline) {
        Duration remaining = deadline.remaining().minus(safetyMargin);
        if (remaining.isNegative() || remaining.isZero()) {
            throw new DeadlineExceededBeforeCallException();
        }
        return remaining.compareTo(requestTimeout) < 0 ? remaining : requestTimeout;
    }
}

A client should calculate effective timeout per request, not once at startup.
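The TimeoutPolicy above relies on a Deadline type that this lesson does not define. A minimal sketch of what such a type might look like follows; the shape, the injected Clock, and the use of IllegalStateException in place of a dedicated exception are all assumptions made for illustration.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;

// Hypothetical Deadline type: an absolute expiry instant plus a clock to measure against.
final class Deadline {
    private final Instant expiresAt;
    private final Clock clock;

    private Deadline(Instant expiresAt, Clock clock) {
        this.expiresAt = expiresAt;
        this.clock = clock;
    }

    static Deadline after(Duration budget, Clock clock) {
        return new Deadline(clock.instant().plus(budget), clock);
    }

    // Remaining budget, never negative.
    Duration remaining() {
        Duration left = Duration.between(clock.instant(), expiresAt);
        return left.isNegative() ? Duration.ZERO : left;
    }

    // The smaller of the configured timeout and what is left after the safety margin.
    Duration effectiveTimeout(Duration configured, Duration safetyMargin) {
        Duration usable = remaining().minus(safetyMargin);
        if (usable.isNegative() || usable.isZero()) {
            throw new IllegalStateException("deadline exceeded before call");
        }
        return usable.compareTo(configured) < 0 ? usable : configured;
    }
}
```

Injecting the Clock keeps the calculation testable: with 240ms remaining, a 700ms configured timeout, and a 20ms margin, the effective timeout is 220ms.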

5. Retry Configuration

Retry is not a generic reliability feature.

Retry is a load multiplier.

If a service receives 10,000 requests per second and every caller retries twice during an outage (three attempts per request in total), the downstream may experience 30,000 attempts per second exactly when it is least able to handle them.

5.1 Retry Eligibility

Retry needs three answers:

Question	Example
Is the operation semantically retryable?	GET usually yes; POST command only with idempotency key
Is the failure retryable?	`503` maybe; `400` no
Is there budget left?	No retry if deadline nearly expired

Do not configure retries only by status code.

Configure retries by operation semantics.

operations:
  getCustomer:
    retry-profile: safe-read
  createPayment:
    retry-profile: idempotent-command-only
  submitAuditRecord:
    retry-profile: no-http-retry

5.2 Retry Profiles

retry-profiles:
  safe-read:
    max-attempts: 2
    retry-on-status: [502, 503, 504]
    retry-on-exceptions: [connect-timeout, connection-reset]
    backoff: exponential-jitter

  idempotent-command-only:
    max-attempts: 2
    requires-idempotency-key: true
    retry-on-status: [409-retryable, 502, 503, 504]
    backoff: bounded-jitter

  no-http-retry:
    max-attempts: 1

5.3 Backoff and Jitter

Backoff without jitter synchronizes clients.

Jitter breaks synchronization.

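import java.time.Duration;
import java.util.concurrent.ThreadLocalRandom;

// Scales base by a random factor in [1 - jitterRatio, 1 + jitterRatio).
static Duration jitteredBackoff(Duration base, double jitterRatio) {
    double min = 1.0 - jitterRatio;
    double max = 1.0 + jitterRatio;
    double factor = min + ThreadLocalRandom.current().nextDouble() * (max - min);
    return Duration.ofMillis((long) (base.toMillis() * factor));
}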

For production code, prefer a tested library or centralized utility.

The important part is the invariant:

retry delay must not push the call beyond its deadline
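The invariant above can be enforced with a small check before sleeping. This is a sketch, not a library API: RetryDelays, its method names, and the idea of a minimum useful attempt time are assumptions introduced here.

```java
import java.time.Duration;
import java.util.Optional;

// Hypothetical helpers for capped exponential backoff that respects a deadline.
final class RetryDelays {
    private RetryDelays() {}

    // Doubles the delay per retry, starting at initial, and never exceeds max.
    // retryNumber 1 is the first retry.
    static Duration exponential(Duration initial, Duration max, int retryNumber) {
        int shift = Math.min(Math.max(retryNumber - 1, 0), 20); // bounded to avoid overflow
        Duration grown = initial.multipliedBy(1L << shift);
        return grown.compareTo(max) < 0 ? grown : max;
    }

    // Returns the delay only if the wait plus a minimum useful attempt still fits
    // in the remaining budget; otherwise the caller should give up instead of sleeping.
    static Optional<Duration> fitWithinDeadline(Duration delay, Duration remaining, Duration minAttemptTime) {
        return delay.plus(minAttemptTime).compareTo(remaining) <= 0
                ? Optional.of(delay)
                : Optional.empty();
    }
}
```

Jitter would be applied to the result of exponential before the deadline check, so the check always sees the delay that will actually be slept.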

5.4 Retry Budget

A retry budget caps retry amplification.

Instead of allowing every request to retry, allow only a percentage of total calls to become retries.

Example policy:

retry-budget:
  max-retry-ratio: 0.20
  window: 30s

Meaning:

At most 20 retry attempts per 100 original calls in a 30-second window.

This is especially useful for high-throughput internal clients.
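A retry budget with these semantics can be approximated with a fixed-window counter. The sketch below is deliberately simple and entirely hypothetical: it resets counts at window boundaries instead of sliding, takes the clock as a LongSupplier so it can be tested, and relies on synchronized methods rather than lock-free counters.

```java
import java.util.function.LongSupplier;

// Hypothetical fixed-window retry budget: retries may not exceed
// maxRetryRatio times the original calls seen in the current window.
final class RetryBudget {
    private final double maxRetryRatio;
    private final long windowMillis;
    private final LongSupplier clockMillis;
    private long windowStart;
    private long originalCalls;
    private long retries;

    RetryBudget(double maxRetryRatio, long windowMillis, LongSupplier clockMillis) {
        this.maxRetryRatio = maxRetryRatio;
        this.windowMillis = windowMillis;
        this.clockMillis = clockMillis;
        this.windowStart = clockMillis.getAsLong();
    }

    // Call once per original (non-retry) attempt.
    synchronized void recordOriginalCall() {
        rollWindowIfNeeded();
        originalCalls++;
    }

    // Grants a retry only while the ratio stays within the budget.
    synchronized boolean tryAcquireRetryPermit() {
        rollWindowIfNeeded();
        if (retries + 1 > originalCalls * maxRetryRatio) {
            return false;
        }
        retries++;
        return true;
    }

    private void rollWindowIfNeeded() {
        long now = clockMillis.getAsLong();
        if (now - windowStart >= windowMillis) {
            windowStart = now;
            originalCalls = 0;
            retries = 0;
        }
    }
}
```

A fixed window forgets everything at the boundary, so a production implementation would more likely use a sliding window or token bucket; the ratio check is the part that carries over.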

5.5 Retry Decision Function

public boolean shouldRetry(ClientAttemptResult<?> result, AttemptContext ctx) {
    if (ctx.attemptNumber() >= ctx.policy().maxAttempts()) return false;
    if (!ctx.operation().isSemanticallyRetryable()) return false;
    if (!ctx.policy().retryClassifier().isRetryable(result)) return false;
    if (!ctx.deadline().hasEnoughTimeForAnotherAttempt()) return false;
    // Acquire the budget permit last so that retries rejected for other reasons never consume budget.
    return ctx.retryBudget().tryAcquireRetryPermit();
}

This shape is better than scattering retry annotations across clients.

6. Circuit Breaker Configuration

A circuit breaker protects the caller and downstream from repeated failed attempts.

It is not a retry replacement.

It is a stop mechanism.

6.1 What Should Count as Failure?

Not every non-2xx response should count.

Outcome	Count as circuit failure?	Reason
`400 Bad Request`	Usually no	Caller bug, not downstream availability
`401/403`	Usually no	Auth/config problem, not capacity failure
`404` domain not found	No	Valid domain outcome
`408/504`	Yes	Timeout path
`429`	Usually yes or special overload class	Downstream throttling
`500/502/503`	Yes	Server/dependency failure
connection refused/reset	Yes	Transport failure
response too large	Usually no for downstream health, yes for operation failure	Contract/payload policy issue

A circuit breaker must classify failures according to dependency semantics.
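The table can be turned into a small classifier that the breaker consults instead of treating every non-2xx response as a failure. This is a sketch with assumed names (CircuitFailureClassifier, Verdict); it takes the "usually" rows at face value, counts 429 as a failure, and ignores any status the table does not list.

```java
import java.net.SocketException;
import java.net.http.HttpTimeoutException;

// Hypothetical classifier mirroring the table above.
final class CircuitFailureClassifier {
    private CircuitFailureClassifier() {}

    enum Verdict { SUCCESS, IGNORED, FAILURE }

    static Verdict classifyStatus(int status) {
        if (status >= 200 && status < 300) {
            return Verdict.SUCCESS;
        }
        return switch (status) {
            // Timeout path, throttling, and server/dependency failures.
            case 408, 429, 500, 502, 503, 504 -> Verdict.FAILURE;
            // 400, 401, 403, 404 and anything unlisted do not reflect downstream availability.
            default -> Verdict.IGNORED;
        };
    }

    static Verdict classifyException(Throwable error) {
        // ConnectException (connection refused) is a subclass of SocketException.
        if (error instanceof SocketException || error instanceof HttpTimeoutException) {
            return Verdict.FAILURE;
        }
        return Verdict.IGNORED;
    }
}
```

IGNORED means the call neither opens nor helps close the circuit; it can still be an operation failure that is reported to the caller.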

6.2 Slow Calls Matter

A downstream can be “successful” and still dangerous.

If every call returns 200 after 2 seconds, the caller is still dying.

Configure slow-call thresholds:

circuit-breaker:
  sliding-window-size: 100
  minimum-number-of-calls: 50
  failure-rate-threshold: 50
  slow-call-duration-threshold: 500ms
  slow-call-rate-threshold: 60
  wait-duration-in-open-state: 10s
  permitted-calls-in-half-open-state: 5

6.3 Per Dependency, Not Per Host Instance

Usually the breaker belongs to a logical dependency operation:

customer-service.getCustomer
payment-service.authorizePayment

Not:

10.42.1.17:8080
10.42.1.18:8080

Instance-level failure handling is usually the job of load balancer, service mesh, endpoint discovery, or connection pool health.

Operation-level circuit breaking answers:

Is this dependency operation safe to keep calling from this service?

7. Bulkhead Configuration

Bulkheads prevent one dependency from consuming all local resources.

A timeout limits duration.

A circuit breaker limits repeated failure.

A bulkhead limits concurrency.

If notification service is slow, it should not consume all threads needed to authorize payments.

7.1 Semaphore vs Thread-Pool Bulkhead

Bulkhead type	Use when	Risk
Semaphore	Caller thread already appropriate; mostly non-blocking or bounded blocking	Caller thread waits if max wait > 0
Thread pool	Need separate execution resource	Queueing, context propagation complexity, more tuning

In modern Java services, start with explicit concurrency limits and avoid unnecessary thread-pool hopping unless there is a clear reason.

7.2 Max Wait

For service-to-service calls, prefer:

bulkhead:
  max-wait: 0ms

Fail fast when concurrency is exhausted.

Waiting inside a bulkhead queue often hides overload until latency explodes.

If queueing is required, make it very small and observable.
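A fail-fast semaphore bulkhead with max-wait of zero is small enough to sketch directly. FailFastBulkhead and BulkheadFullException are hypothetical names; the only real API used is java.util.concurrent.Semaphore, whose tryAcquire() returns immediately instead of waiting.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical semaphore bulkhead that rejects immediately when full.
final class FailFastBulkhead {
    static final class BulkheadFullException extends RuntimeException {
        BulkheadFullException(String name) {
            super("bulkhead '" + name + "' is full");
        }
    }

    private final String name;
    private final Semaphore permits;

    FailFastBulkhead(String name, int maxConcurrentCalls) {
        this.name = name;
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    <T> T execute(Supplier<T> call) {
        // tryAcquire() does not block: this is the max-wait: 0ms behavior.
        if (!permits.tryAcquire()) {
            throw new BulkheadFullException(name);
        }
        try {
            return call.get();
        } finally {
            permits.release();
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

The rejection is a distinct exception type so that it can be counted and mapped separately from downstream failures.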

8. Connection Pool Configuration

The connection pool is part of the bulkhead.

It limits how much socket-level concurrency the caller can create toward a dependency.

8.1 Pool Properties

Depending on the HTTP client implementation, you may configure:

max connections total
max connections per route/host
idle connection timeout
connection time-to-live
pending acquisition timeout
HTTP/2 max concurrent streams
TLS session reuse
DNS refresh behavior

The JDK HttpClient exposes fewer direct pool controls than Apache HttpClient, OkHttp, or Reactor Netty. That does not mean pool behavior is irrelevant. It means your policy model may need either implementation-specific adapters or client choice based on required controls.

8.2 Pool Sizing Mental Model

Approximate concurrency needed:

required concurrent calls ≈ request rate × average latency

If a client sends 500 requests/second to a dependency and average latency is 40ms:

500 × 0.040 = 20 concurrent calls

Then add headroom, but not infinity.

If p95 is 150ms:

500 × 0.150 = 75 concurrent calls at p95 pressure

A pool of 500 might hide downstream slowdown and amplify load.

A pool of 10 might throttle normal traffic.

A reasonable first cut might be 50–100 depending on SLO, instances, downstream capacity, and operation criticality.

The number is not universal.

The method matters.
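The sizing method above is just rate times latency plus bounded headroom, which is easy to encode so the numbers are derived rather than guessed. The class name, the headroom factor, and the hard cap are assumptions for illustration.

```java
import java.time.Duration;

// Hypothetical sizing helper based on: concurrency ≈ request rate × latency.
final class PoolSizing {
    private PoolSizing() {}

    static double requiredConcurrency(double requestsPerSecond, Duration latency) {
        return requestsPerSecond * latency.toNanos() / 1_000_000_000.0;
    }

    // Adds headroom but never exceeds hardCap, so the limit cannot drift toward "infinity".
    static int suggestedLimit(double requestsPerSecond, Duration latency, double headroomFactor, int hardCap) {
        int withHeadroom = (int) Math.ceil(requiredConcurrency(requestsPerSecond, latency) * headroomFactor);
        return Math.min(Math.max(withHeadroom, 1), hardCap);
    }
}
```

Calling requiredConcurrency(500, Duration.ofMillis(40)) reproduces the 20 concurrent calls from the text, and the same rate at a 150ms p95 gives 75.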

8.3 Pool Exhaustion Is a Signal

If pool acquisition fails, do not map it to a generic downstream timeout.

It means the caller is locally saturated for that dependency.

Use distinct error classification:

CLIENT_POOL_EXHAUSTED

This helps incident response.

The downstream may be perfectly healthy while the caller's pool is misconfigured.

9. Rate Limiter Configuration

Rate limiting can exist at multiple levels:

gateway rate limit
service mesh rate limit
downstream server rate limit
client-side rate limit
per-tenant/business quota

Client-side rate limiting is useful when the caller knows it must not exceed a downstream contract.

Example:

rate-limit:
  limit-for-period: 500
  refresh-period: 1s
  timeout-duration: 0ms

The timeout-duration should usually be zero for service-to-service calls.

If the permit is not available, fail fast or degrade.

Waiting inside a rate limiter is another hidden queue.

10. Payload Configuration

Payload policy belongs in the client configuration.

Not only in JSON mapper code.

payload:
  max-request-bytes: 262144
  max-response-bytes: 1048576
  compression:
    request: false
    response: true
  json:
    fail-on-unknown-properties: false
    fail-on-null-for-primitives: true

10.1 Why Payload Limits Matter

An internal API can still return accidentally huge data:

missing pagination
wrong filter
expanded graph
recursive serialization
accidental debug field
binary payload encoded in JSON
gzip bomb-like compressed response

Payload size is a reliability concern.

10.2 Response Body Strategy

For small JSON:

HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8)

For large response:

stream to file/object store
parse incrementally
use pagination
reject if too large

Never let “internal call” mean “unbounded body”.
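The "reject if too large" option can be enforced while reading, without buffering an unbounded body first. The sketch below assumes the response is available as an InputStream (with the JDK client, HttpResponse.BodyHandlers.ofInputStream() provides one); BoundedBodyReader and ResponseTooLargeException are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical bounded reader: stops as soon as the body exceeds maxBytes.
final class BoundedBodyReader {
    private BoundedBodyReader() {}

    static final class ResponseTooLargeException extends IOException {
        ResponseTooLargeException(long maxBytes) {
            super("response body exceeds " + maxBytes + " bytes");
        }
    }

    static byte[] readAtMost(InputStream body, long maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = body.read(buffer)) != -1) {
            total += read;
            if (total > maxBytes) {
                // Fail before copying the excess, so memory use stays near maxBytes.
                throw new ResponseTooLargeException(maxBytes);
            }
            out.write(buffer, 0, read);
        }
        return out.toByteArray();
    }
}
```

The limit applies to whatever bytes the stream yields, so if the stream is wrapped in a decompressing stream the check also bounds the decompressed size.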

11. Redirect, Proxy, TLS, and Security Configuration

Even internal clients need explicit transport security decisions.

Setting	Production guidance
Redirects	Usually disabled for service-to-service unless explicitly needed
Proxy	Explicit; avoid accidental environment proxy behavior
TLS trust	Use controlled trust store; avoid trust-all clients
Hostname verification	Do not disable in production
mTLS	Usually platform/mesh-managed or explicitly client-managed
Credentials	Inject from secure config; never hardcode
Header propagation	Allowlist, not copy-all

Do not let a generic client follow redirects across trust boundaries.

A redirect can turn an internal call into an unintended external call if misconfigured.

12. DNS and Discovery Configuration

DNS is not just name lookup.

It affects availability and load distribution.

Important questions:

Does the client cache DNS?
How long?
Does it respect TTL?
Does the service mesh intercept DNS or route at proxy level?
Does the client reuse a connection to one resolved endpoint for too long?
How does it react when endpoints are removed?

A client policy should include discovery assumptions:

discovery:
  mode: kubernetes-dns
  expects-mesh-routing: true
  connection-reuse-policy: mesh-friendly

The application client does not always need to implement discovery logic.

But it must be aware of the platform doing discovery for it.

13. Error Mapper Configuration

Error mapping is configuration plus code.

A client should expose domain-safe errors to business logic.

sealed interface CustomerLookupResult permits CustomerFound, CustomerMissing, CustomerLookupUnavailable {}

record CustomerFound(CustomerSnapshot customer) implements CustomerLookupResult {}
record CustomerMissing(CustomerId customerId) implements CustomerLookupResult {}
record CustomerLookupUnavailable(DependencyFailure failure) implements CustomerLookupResult {}

Mapping table:

HTTP outcome	Client result
`200`	`CustomerFound`
`404`	`CustomerMissing` if the operation semantics define this as domain absence
`409`	domain conflict or retryable conflict depending on body type
`429`	overload/throttled dependency failure
`500/502/503/504`	dependency unavailable
timeout	unknown outcome / dependency timeout
pool exhausted	local dependency saturation

This mapping must be consistent with retry and circuit breaker classification.

Do not let the retry classifier treat 404 as a final outcome while business logic treats it as exceptional in some call paths and as valid in others. Decide once per operation and apply that decision everywhere.
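To make the mapping table concrete, here is a sketch of a status-to-result function for the customer lookup. The types are simplified stand-ins for the ones declared above (plain strings instead of CustomerSnapshot and CustomerId, and a hypothetical FailureKind enum instead of DependencyFailure). The 409 row is left to the default branch because it does not apply to a plain read.

```java
// Hypothetical mapper for the getCustomer operation.
final class CustomerLookupMapping {
    private CustomerLookupMapping() {}

    enum FailureKind { THROTTLED, UNAVAILABLE, UNEXPECTED_STATUS }

    sealed interface Result permits Found, Missing, Unavailable {}
    record Found(String body) implements Result {}
    record Missing(String customerId) implements Result {}
    record Unavailable(FailureKind kind, int status) implements Result {}

    static Result fromStatus(int status, String customerId, String body) {
        return switch (status) {
            case 200 -> new Found(body);
            // Domain absence, not a dependency failure.
            case 404 -> new Missing(customerId);
            case 429 -> new Unavailable(FailureKind.THROTTLED, status);
            case 500, 502, 503, 504 -> new Unavailable(FailureKind.UNAVAILABLE, status);
            // Anything unlisted is surfaced explicitly rather than guessed at.
            default -> new Unavailable(FailureKind.UNEXPECTED_STATUS, status);
        };
    }
}
```

Keeping this in one function per operation makes it straightforward to check that retry and circuit breaker classification agree with what business logic sees.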

14. Central Policy Registry

Do not duplicate raw values across clients.

Create named profiles.

communication-policies:
  profiles:
    low-latency-read:
      timeout:
        connect: 100ms
        request: 300ms
        pool-acquire: 20ms
      retry:
        max-attempts: 2
        backoff: 30ms-jitter
      bulkhead:
        max-concurrent-calls: 80

    critical-command:
      timeout:
        connect: 150ms
        request: 900ms
        pool-acquire: 50ms
      retry:
        max-attempts: 1
      bulkhead:
        max-concurrent-calls: 30

    async-enqueue:
      timeout:
        connect: 100ms
        request: 250ms
      retry:
        max-attempts: 2
        requires-idempotency-key: true

Then bind operations to profiles:

clients:
  customer-service:
    operations:
      getCustomer:
        profile: low-latency-read
      updateCustomerStatus:
        profile: critical-command

Named profiles improve consistency.

Operation overrides remain possible, but visible.

15. Java Configuration Shape

A useful model is a set of immutable records.

public record ClientConfig(
        String name,
        URI baseUri,
        Map<String, OperationConfig> operations,
        TimeoutPolicy defaultTimeout,
        RetryPolicy defaultRetry,
        CircuitBreakerPolicy defaultCircuitBreaker,
        BulkheadPolicy defaultBulkhead,
        PayloadPolicy defaultPayload
) {}

public record OperationConfig(
        String operationName,
        String method,
        String pathTemplate,
        TimeoutPolicy timeout,
        RetryPolicy retry,
        CircuitBreakerPolicy circuitBreaker,
        BulkheadPolicy bulkhead,
        PayloadPolicy payload,
        ObservabilityPolicy observability
) {}

Avoid passing raw Duration and integer properties everywhere.

Strong types reveal intent.

15.1 Retry Policy Type

public record RetryPolicy(
        int maxAttempts,
        Duration initialBackoff,
        Duration maxBackoff,
        boolean jitterEnabled,
        boolean requiresIdempotencyKey,
        Set<Integer> retryableStatuses,
        Set<Class<? extends Throwable>> retryableExceptions
) {
    public boolean enabled() {
        return maxAttempts > 1;
    }
}

15.2 Bulkhead Policy Type

public record BulkheadPolicy(
        int maxConcurrentCalls,
        Duration maxWaitDuration
) {
    public boolean failFast() {
        return maxWaitDuration.isZero();
    }
}

15.3 Payload Policy Type

public record PayloadPolicy(
        long maxRequestBytes,
        long maxResponseBytes,
        boolean requestCompressionEnabled,
        boolean responseCompressionEnabled
) {}

16. Spring Boot Binding Example

@ConfigurationProperties(prefix = "communication")
public record CommunicationProperties(
        Map<String, ClientProperties> clients,
        Map<String, PolicyProfile> profiles
) {}

public record ClientProperties(
        URI baseUrl,
        String defaultProfile,
        Map<String, OperationProperties> operations
) {}

public record OperationProperties(
        String method,
        String pathTemplate,
        String profile,
        TimeoutProperties timeoutOverride,
        RetryProperties retryOverride
) {}

Example YAML:

communication:
  profiles:
    low-latency-read:
      timeout:
        connect: 100ms
        request: 300ms
        pool-acquire: 20ms
      retry:
        max-attempts: 2
        initial-backoff: 25ms
        max-backoff: 100ms
        jitter-enabled: true

  clients:
    customer-service:
      base-url: http://customer-service.default.svc.cluster.local
      default-profile: low-latency-read
      operations:
        getCustomer:
          method: GET
          path-template: /customers/{customerId}
          profile: low-latency-read

Bind configuration early at startup.

Validate aggressively.

Invalid communication policy should fail startup, not fail during an incident.

17. Configuration Validation

Add validation rules.

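public final class CommunicationPolicyValidator {
    public void validate(OperationConfig operation) {
        requirePositive(operation.timeout().requestTimeout(), "request timeout");
        requirePositive(operation.timeout().connectTimeout(), "connect timeout");

        // maxAttempts counts the first try, so 1 means "no retry".
        if (operation.retry().maxAttempts() < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }

        // isRetrySafe() is assumed to be a derived accessor on OperationConfig
        // (for example: safe method, or idempotency key required).
        if (operation.retry().maxAttempts() > 1 && !operation.isRetrySafe()) {
            throw new IllegalArgumentException(
                    operation.operationName() + " retries but is not retry-safe");
        }

        if (operation.timeout().connectTimeout().compareTo(operation.timeout().requestTimeout()) >= 0) {
            throw new IllegalArgumentException("connect timeout must be smaller than request timeout");
        }

        if (operation.bulkhead().maxConcurrentCalls() <= 0) {
            throw new IllegalArgumentException("bulkhead maxConcurrentCalls must be positive");
        }
    }

    private static void requirePositive(Duration value, String name) {
        if (value == null || value.isZero() || value.isNegative()) {
            throw new IllegalArgumentException(name + " must be positive");
        }
    }
}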

Validation examples:

Rule	Why
`connectTimeout < requestTimeout`	Otherwise no useful request budget remains
`maxAttempts >= 1`	Attempt count includes first try
Retry command requires idempotency key	Prevent duplicate side effects
Bulkhead max wait <= request timeout	Avoid hidden wait exceeding operation budget
Response size max configured	Prevent unbounded memory usage
Operation name present	Required for metrics/tracing cardinality control

18. Combining Policies in the Call Pipeline

The order matters.

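A typical synchronous pipeline, from outermost to innermost:

telemetry
deadline calculation
bulkhead
circuit breaker check
retry loop
HTTP attempt
result classification and error mapping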

18.1 Why Bulkhead Before HTTP?

Because you want to reject before consuming transport resources.

18.2 Why Circuit Breaker Before Attempt?

Because when the circuit is open, no attempt should be made.

18.3 Why Retry Inside Deadline?

Because retry must obey the original operation budget.

18.4 Why Telemetry Around Everything?

Because pool exhaustion, circuit open, and rate-limit rejection are client outcomes even when no HTTP request was sent.

19. Example: Policy Executor

public final class HttpOperationExecutor {
    private final HttpTransport transport;
    private final RetryDecider retryDecider;
    private final Telemetry telemetry;

    public <T> T execute(OperationConfig operation, HttpCall<T> call, Deadline parentDeadline) {
        Deadline deadline = Deadline.min(
                parentDeadline,
                Deadline.after(operation.timeout().requestTimeout())
        );

        return telemetry.observe(operation, () -> {
            try (BulkheadPermit ignored = acquireBulkhead(operation, deadline)) {
                checkCircuitBreaker(operation);
                return executeAttempts(operation, call, deadline);
            }
        });
    }

    private <T> T executeAttempts(OperationConfig operation, HttpCall<T> call, Deadline deadline) {
        int attempt = 1;
        while (true) {
            ClientAttemptResult<T> result = transport.send(call, operation, deadline, attempt);
            if (!retryDecider.shouldRetry(result, operation, deadline, attempt)) {
                return result.orThrowMappedException();
            }
            sleepWithinDeadline(operation.retry().backoffFor(attempt), deadline);
            attempt++;
        }
    }
}

The code is intentionally schematic.

The point is the design:

deadline is explicit
policy is operation-specific
retry is centralized
telemetry surrounds client-side outcomes
result mapping happens after classification

20. Resilience4j Integration Shape

Resilience4j provides decorators for CircuitBreaker, Retry, RateLimiter, Bulkhead, and TimeLimiter. That makes it useful for building a consistent outbound client policy.

But do not blindly stack annotations.

Be clear about order.

Example with decorators:

Supplier<CustomerSnapshot> supplier = () -> customerTransport.getCustomer(customerId, deadline);

Supplier<CustomerSnapshot> guarded = Decorators.ofSupplier(supplier)
        .withBulkhead(customerBulkhead)
        .withCircuitBreaker(customerCircuitBreaker)
        .withRetry(customerRetry)
        .decorate();

return guarded.get();

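The exact order should be tested and documented. With Decorators, each later with… call wraps the ones before it, so in this example the retry is outermost and the bulkhead is innermost: every retry attempt passes through the circuit breaker and acquires its own bulkhead permit. That is not the same order as the executor in section 19, where one bulkhead permit is held across all attempts.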

For many teams, a custom executor around Resilience4j primitives is easier to reason about than annotation composition scattered across classes.

21. Dynamic Configuration

Dynamic config is powerful and dangerous.

You may want to change:

timeout
circuit breaker threshold
retry attempts
rate limit
bulkhead limit

without redeploying.

But some changes can destabilize the system.

21.1 Safe Dynamic Changes

Usually safe:

lower retry attempts
lower timeout
open circuit manually
reduce rate limit
reduce max response size

Potentially risky:

increase retry attempts
increase timeout massively
increase bulkhead concurrency
disable circuit breaker
enable request compression for all calls

Runtime changes should pass validation and audit.

config change = production event

It should be visible in logs, metrics annotations, dashboards, and incident timelines.
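The safe and risky lists above can be encoded as a review step that runs before a runtime change is applied. This is a hypothetical sketch: the Snapshot record covers only four settings, and it flags any timeout increase as risky, which is stricter than "massively".

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-apply review for dynamic policy changes.
final class DynamicChangeReview {
    private DynamicChangeReview() {}

    record Snapshot(Duration requestTimeout,
                    int maxAttempts,
                    int maxConcurrentCalls,
                    boolean circuitBreakerEnabled) {}

    // Returns the reasons a change needs extra scrutiny; an empty list means
    // the change only tightens limits.
    static List<String> riskyChanges(Snapshot current, Snapshot proposed) {
        List<String> risks = new ArrayList<>();
        if (proposed.maxAttempts() > current.maxAttempts()) {
            risks.add("retry attempts increased");
        }
        if (proposed.requestTimeout().compareTo(current.requestTimeout()) > 0) {
            risks.add("request timeout increased");
        }
        if (proposed.maxConcurrentCalls() > current.maxConcurrentCalls()) {
            risks.add("bulkhead concurrency increased");
        }
        if (current.circuitBreakerEnabled() && !proposed.circuitBreakerEnabled()) {
            risks.add("circuit breaker disabled");
        }
        return risks;
    }
}
```

The returned reasons are a natural payload for the audit record of the change.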

22. Configuration Ownership

Who owns the client config?

Bad answer:

Everyone edits YAML when they need to.

Better answer:

Config area	Owner
Operation semantics	Consuming service team
Retry safety	Consuming and providing teams agree
Downstream capacity limit	Providing service team publishes guidance
Timeout/SLO budget	Consuming service owns, aligned with platform SLO
Circuit threshold	Consuming service owns, platform advises
Gateway/mesh policy	Platform team owns
Global safety ceiling	Platform/SRE owns

The client is in the caller process.

But the dependency contract is shared.

23. Configuration Drift

Configuration drift happens when different services call the same dependency with wildly different assumptions.

Example:

Caller	Timeout	Retry	Bulkhead
Order Service	300ms	1 retry	50
Billing Service	5s	3 retries	500
Report Service	30s	5 retries	unlimited

During dependency slowdown, the aggressive callers dominate capacity.

Prevent this with:

shared policy profiles
dependency owner guidance
config linting
runtime dashboards by caller/dependency/operation
incident review on retry/timeout behavior

24. Anti-Patterns

24.1 One Client Bean for Everything

@Bean
RestClient restClient() {
    return RestClient.create();
}

This encourages all dependencies to share behavior.

Instead, create dependency-specific clients or policy-specific builders.

24.2 Retry Annotation on Interface Method

@Retry(name = "default")
Customer getCustomer(String id);

This hides operation semantics.

The annotation does not explain whether the operation is idempotent, how deadline is enforced, or whether retry budget exists.

24.3 Infinite Queue Behind Bulkhead

A queue turns overload into latency.

Latency then creates more concurrency.

Concurrency creates more latency.

That is a positive feedback loop.

24.4 Same Timeout Everywhere

A single timeout means no design happened.

24.5 Copy-All Headers

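Propagating all inbound headers to downstream services creates security, privacy, routing, and observability problems: credentials and cookies reach services that should never see them, and hop-specific headers get replayed where they do not belong.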

Use allowlists.
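An allowlist filter is only a few lines. In this sketch HeaderAllowlist is a hypothetical class, header names are matched case-insensitively, and the allowed names in the usage (traceparent, x-request-id) are examples rather than a recommended set.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical filter: only explicitly allowed headers are propagated downstream.
final class HeaderAllowlist {
    private final Set<String> allowedLowerCase;

    HeaderAllowlist(Set<String> allowed) {
        this.allowedLowerCase = allowed.stream()
                .map(name -> name.toLowerCase(Locale.ROOT))
                .collect(Collectors.toUnmodifiableSet());
    }

    Map<String, List<String>> filter(Map<String, List<String>> inbound) {
        // Case-insensitive map, because HTTP header names are case-insensitive.
        Map<String, List<String>> outbound = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        inbound.forEach((name, values) -> {
            if (allowedLowerCase.contains(name.toLowerCase(Locale.ROOT))) {
                outbound.put(name, List.copyOf(values));
            }
        });
        return outbound;
    }
}
```

Anything not named, including Authorization and Cookie, is dropped by default, which is the opposite failure mode from copy-all.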

24.6 Generated Client Owns Policy

Generated clients should know paths and schemas.

They should not own resilience behavior.

Wrap them.

25. Production Checklist

Before approving an outbound client, ask:

26. Reference Configuration Template

communication:
  global:
    max-request-timeout: 5s
    default-safety-margin: 25ms
    max-response-bytes: 1048576

  profiles:
    internal-read:
      timeout:
        connect: 100ms
        pool-acquire: 20ms
        request: 300ms
        safety-margin: 25ms
      retry:
        max-attempts: 2
        initial-backoff: 25ms
        max-backoff: 100ms
        jitter: true
        retry-on-status: [502, 503, 504]
      circuit-breaker:
        enabled: true
        sliding-window-size: 100
        minimum-calls: 50
        failure-rate-threshold: 50
        slow-call-duration-threshold: 250ms
        slow-call-rate-threshold: 60
      bulkhead:
        max-concurrent-calls: 80
        max-wait: 0ms
      payload:
        max-request-bytes: 65536
        max-response-bytes: 262144

    idempotent-command:
      timeout:
        connect: 150ms
        pool-acquire: 50ms
        request: 800ms
        safety-margin: 30ms
      retry:
        max-attempts: 2
        requires-idempotency-key: true
        initial-backoff: 50ms
        max-backoff: 150ms
        jitter: true
        retry-on-status: [502, 503, 504]
      circuit-breaker:
        enabled: true
        sliding-window-size: 100
        minimum-calls: 30
        failure-rate-threshold: 40
      bulkhead:
        max-concurrent-calls: 30
        max-wait: 0ms

  clients:
    customer-service:
      base-url: http://customer-service.default.svc.cluster.local
      default-profile: internal-read
      discovery:
        mode: kubernetes-dns
        expects-mesh-routing: true
      operations:
        getCustomer:
          method: GET
          path-template: /customers/{customerId}
          profile: internal-read
        updateCustomerStatus:
          method: PUT
          path-template: /customers/{customerId}/status
          profile: idempotent-command
          idempotency-key:
            required: true

This is not meant to be copied blindly.

It is a shape.

Adapt the numbers to your SLO, load, client library, platform, and downstream contract.

27. Exercises

Exercise 1 — Classify Operations

For each operation in one of your services, classify:

Operation	Safe?	Idempotent?	Retryable?	Timeout	Bulkhead
`getCustomer`	yes	yes	yes	?	?
`createPayment`	no	only with key	maybe	?	?
`sendNotification`	no	only with message id	maybe	?	?

If you cannot answer, the client is not production-ready.

Exercise 2 — Find Hidden Queues

Look for queues in:

servlet thread pool
WebClient/Reactor scheduler
connection pool pending acquisition
bulkhead
rate limiter
retry executor
service mesh proxy
downstream queue

Every hidden queue should have:

max size
wait time
metric
owner
failure mode

Exercise 3 — Rewrite a Generic Client

Take a generic client method:

CustomerDto getCustomer(String id);

Rewrite it as:

CustomerLookupResult getCustomer(CustomerId id, RequestContext context);

Then define:

timeout policy
retry policy
circuit breaker classifier
error mapping
observability attributes

28. Summary

A Java HTTP client is not production-grade because it can send a request.

It becomes production-grade when communication behavior is explicit.

The minimum useful configuration model defines:

dependency and operation names
timeout and deadline behavior
retry eligibility, backoff, jitter, and budget
circuit breaker classification
bulkhead and pool limits
rate limiting
payload bounds
redirect/TLS/proxy/header policy
error mapping
observability labels
validation and ownership

The most important mental shift:

Do not configure clients by library knobs. Configure them by dependency behavior.

Part 026 builds the next layer: client-side observability.

A policy you cannot observe is a policy you cannot trust.
