Spring WebClient: Reactive HTTP Without Magical Thinking
Learn Java Microservices Communication - Part 021
Production-grade guide to Spring WebClient for reactive and non-blocking HTTP communication in Java microservices.
WebClient is Spring's reactive HTTP client.
That statement is true, but incomplete.
In production microservices, WebClient is not valuable because it has a fluent API. RestClient also has a fluent API. WebClient is valuable when the communication problem benefits from non-blocking composition, streaming, high concurrency with constrained threads, or reactive backpressure-aware integration.
The wrong reason to use WebClient:
"It is newer, therefore it is better."
The right reason:
"The caller needs to compose many remote operations, stream data, or avoid tying request concurrency to one blocking thread per call."
Spring's own reference describes WebClient as non-blocking, reactive, supporting streaming, and using codecs shared with WebFlux server infrastructure. That gives it real capability, but also real failure modes.
This part is about using WebClient as a communication primitive, not as a syntax upgrade.
1. Core Mental Model
WebClient is best understood as this:
WebClient
= HTTP request builder
+ reactive publisher API
+ codec layer
+ exchange function
+ underlying HTTP client connector
+ Reactor composition model
A typical call:
Mono<ProductView> productMono = webClient.get()
.uri("/v1/products/{id}", productId)
.retrieve()
.bodyToMono(ProductView.class);
This does not execute immediately.
It only assembles a reactive pipeline. The HTTP request is sent when something subscribes.
That one detail changes the mental model:
RestClient call:
method invocation performs network I/O now
WebClient call:
method invocation describes network I/O that will happen later on subscription
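The difference between performing I/O now and describing it for later can be shown without any Spring dependency. The sketch below is a JDK-only analogy, not WebClient itself: LazyCallSketch, fetchProduct, and the call counter are hypothetical names, and a Supplier stands in for the cold publisher that a Mono represents.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class LazyCallSketch {
    // Counts how many times the simulated remote call actually ran.
    static final AtomicInteger CALLS = new AtomicInteger();

    // Hypothetical stand-in for a remote call.
    static String fetchProduct(String id) {
        CALLS.incrementAndGet();
        return "product-" + id;
    }

    // Eager: the work runs as soon as this method is invoked.
    static CompletableFuture<String> eager(String id) {
        return CompletableFuture.completedFuture(fetchProduct(id));
    }

    // Lazy: returns only a description; nothing runs until get() is called.
    static Supplier<CompletableFuture<String>> lazy(String id) {
        return () -> CompletableFuture.completedFuture(fetchProduct(id));
    }
}
```

Calling lazy("42") leaves the counter at zero. Only get() triggers the simulated call, and each further get() repeats it, which mirrors how every subscription to a WebClient Mono sends a new request.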
So the question is not merely:
Can I call another service?
The real questions are:
Who subscribes?
Which thread executes what?
Where is the timeout applied?
Where are errors mapped?
How is cancellation propagated?
How is context propagated?
How much response body can be buffered?
How is concurrency bounded?
If those questions are not answered, using WebClient may make the system harder to reason about than a blocking client.
2. When WebClient Is a Good Fit
Use WebClient when at least one of these is true:
| Need | Why WebClient Helps |
|---|---|
| Many concurrent outbound calls | Avoids one blocking worker thread per in-flight call |
| Streaming response body | Flux<T> maps naturally to streaming consumption |
| Reactive service stack | Avoids blocking inside event-loop/reactive handlers |
| Request composition | zip, flatMap, concatMap, timeout, retryWhen model complex flows |
| Cancellation matters | Reactive cancellation can cancel downstream work if wired correctly |
| Backpressure-aware integration | Can cooperate with downstream consumers instead of buffering everything |
Examples:
API aggregator calling 6 services concurrently
search service streaming partial results
gateway transforming streaming responses
notification service fan-out with bounded concurrency
BFF layer composing multiple internal APIs
Do not use WebClient merely because:
it looks modern
it avoids writing loops
it might be faster
someone said RestTemplate is old
Reactive code can be worse than blocking code if the team does not understand subscription, scheduler boundaries, backpressure, and error channels.
3. When WebClient Is a Bad Fit
WebClient is often the wrong abstraction when:
| Situation | Why It Can Hurt |
|---|---|
| Simple sequential service call | Adds cognitive overhead without benefit |
| Team is fully synchronous | Error handling and debugging become inconsistent |
| You immediately call .block() everywhere | You pay reactive complexity but keep blocking behavior |
| You run on a servlet thread and make a single call | RestClient may be clearer |
| You need strict imperative transaction boundaries | Reactive transaction context requires different discipline |
| You cannot bound concurrency | Non-blocking can amplify load faster than blocking code |
The most common production anti-pattern:
ProductView product = webClient.get()
.uri("/v1/products/{id}", productId)
.retrieve()
.bodyToMono(ProductView.class)
.block();
This is not automatically illegal.
But if this is the default style across the codebase, the service is not really reactive. It is a blocking service using a reactive client as a complicated HTTP library.
A better default in synchronous applications is usually:
RestClient / JDK HttpClient / Feign
A better default in reactive applications is:
WebClient end-to-end, no accidental blocking
4. The Dangerous Simplicity of retrieve()
Basic usage:
Mono<ProductView> result = webClient.get()
.uri("/v1/products/{id}", productId)
.retrieve()
.bodyToMono(ProductView.class);
This is concise, but it hides several decisions:
How are 4xx mapped?
How are 5xx mapped?
Is a 404 domain-empty or error?
How large can the body be?
What if body decoding fails?
What if the server returns text/html?
What if a timeout fires after the server has already committed the command?
What retry policy is safe?
A production client must make these decisions explicit.
Example:
public Mono<ProductView> getProduct(ProductId productId, RequestContext context) {
return webClient.get()
.uri(uriBuilder -> uriBuilder
.path("/v1/products/{id}")
.build(productId.value()))
.header("X-Request-Id", context.requestId())
.header("X-Correlation-Id", context.correlationId())
.retrieve()
.onStatus(HttpStatusCode::is4xxClientError, response -> map4xx(response, productId))
.onStatus(HttpStatusCode::is5xxServerError, this::map5xx)
.bodyToMono(ProductView.class)
.timeout(Duration.ofMillis(350))
.name("catalog.getProduct")
.tag("peer.service", "catalog");
}
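Even this is incomplete unless concurrency, retry, connection pool, and observability are configured consistently. In particular, name() and tag() only label the sequence; they record nothing until a metrics or observation listener is attached to the pipeline.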
5. retrieve() vs exchangeToMono()
Use retrieve() for ordinary success-body mapping with simple status handling.
Use exchangeToMono() when the response handling depends on the whole response:
public Mono<Optional<ProductView>> findProduct(ProductId id) {
return webClient.get()
.uri("/v1/products/{id}", id.value())
.exchangeToMono(response -> {
if (response.statusCode().is2xxSuccessful()) {
return response.bodyToMono(ProductView.class).map(Optional::of);
}
if (response.statusCode().value() == 404) {
return Mono.just(Optional.empty());
}
return response.bodyToMono(ProblemDetailEnvelope.class)
.defaultIfEmpty(ProblemDetailEnvelope.unknown())
.flatMap(problem -> Mono.error(ProductClientException.from(response.statusCode(), problem)));
});
}
The advantage is explicit semantics:
200 => product exists
404 => optional empty
409 => domain conflict
422 => invalid command/query
429 => throttled
503 => overloaded / dependency unavailable
A service client should translate raw protocol response into domain-relevant result types.
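The status table above can be written down as a small, pure mapping function that is easy to unit test. This is a sketch under assumptions: StatusMappingSketch and the Outcome names are hypothetical, every 2xx is treated as "found" for simplicity, and anything not listed falls into a generic protocol failure.

```java
final class StatusMappingSketch {
    // Hypothetical domain-level outcomes for a product lookup.
    enum Outcome { FOUND, ABSENT, CONFLICT, INVALID, THROTTLED, UNAVAILABLE, PROTOCOL_FAILURE }

    // Translates a raw HTTP status into the outcome the caller reasons about.
    static Outcome classify(int status) {
        if (status >= 200 && status < 300) {
            return Outcome.FOUND; // simplification: any 2xx counts as success
        }
        return switch (status) {
            case 404 -> Outcome.ABSENT;
            case 409 -> Outcome.CONFLICT;
            case 422 -> Outcome.INVALID;
            case 429 -> Outcome.THROTTLED;
            case 503 -> Outcome.UNAVAILABLE;
            default -> Outcome.PROTOCOL_FAILURE;
        };
    }

    // In this sketch only throttling and overload are candidates for retry.
    static boolean retryable(Outcome outcome) {
        return outcome == Outcome.THROTTLED || outcome == Outcome.UNAVAILABLE;
    }
}
```

Keeping the mapping free of HTTP client types means the same table can back both the exchangeToMono() branch logic and the retry filter.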
6. Reactive Error Channel
In Reactor, failures are not thrown synchronously from the method that creates the pipeline.
They travel through the reactive error channel.
Mono<ProductView> mono = catalogClient.getProduct(id);
// No network call yet. No exception yet.
Errors occur after subscription:
mono.subscribe(
value -> log.info("product={}", value),
error -> log.warn("catalog failed", error)
);
In production service code, you usually do not manually subscribe. The framework subscribes at the edge.
This means local try/catch around pipeline construction is not enough:
try {
return catalogClient.getProduct(id);
} catch (Exception e) {
// Usually useless for async pipeline errors.
}
Use reactive operators:
return catalogClient.getProduct(id)
.onErrorResume(ProductNotFoundException.class, e -> Mono.empty())
.onErrorMap(TimeoutException.class, e -> new CatalogUnavailableException(e));
Failure policy belongs inside the pipeline.
7. Timeouts: Operator Timeout vs Transport Timeout
A WebClient call can fail due to multiple timeout layers:
DNS resolution timeout
connection establishment timeout
TLS handshake timeout
connection pool acquisition timeout
request write timeout
response read timeout
overall reactive pipeline timeout
caller deadline exceeded
A common mistake is to set only:
.timeout(Duration.ofSeconds(2))
This bounds how long the subscriber waits and cancels the call when it fires, but it says nothing about how long connecting, acquiring a pooled connection, or reading from the socket may take.
A stronger configuration with Reactor Netty often separates transport settings from operation-level budget:
ConnectionProvider provider = ConnectionProvider.builder("catalog-pool")
.maxConnections(200)
.pendingAcquireMaxCount(500)
.pendingAcquireTimeout(Duration.ofMillis(100))
.maxIdleTime(Duration.ofSeconds(30))
.maxLifeTime(Duration.ofMinutes(5))
.build();
HttpClient httpClient = HttpClient.create(provider)
.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 200)
.responseTimeout(Duration.ofMillis(500))
.doOnConnected(conn -> conn
.addHandlerLast(new ReadTimeoutHandler(500, TimeUnit.MILLISECONDS))
.addHandlerLast(new WriteTimeoutHandler(300, TimeUnit.MILLISECONDS)));
WebClient webClient = WebClient.builder()
.baseUrl("http://catalog")
.clientConnector(new ReactorClientHttpConnector(httpClient))
.build();
Then each operation still has its own semantic budget:
return webClient.get()
.uri("/v1/products/{id}", id.value())
.retrieve()
.bodyToMono(ProductView.class)
.timeout(Duration.ofMillis(650));
Do not confuse transport timeout with business operation deadline.
8. Deadline Propagation
Suppose an inbound request has 900 ms remaining.
Your service calls:
catalog
price
inventory
recommendation
If every downstream client has a hard-coded 2-second timeout, the system violates the caller's deadline.
A better model:
incoming remaining budget = 900 ms
service local budget = 100 ms
outbound max budget = 800 ms
For parallel calls:
catalog: 300 ms
price: 300 ms
inventory: 300 ms
optional rec: 150 ms
Example context-aware timeout:
public Mono<ProductPage> assemble(ProductId id, RequestContext ctx) {
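Duration remaining = ctx.deadline().remaining();
// Reserve 100 ms for local work and cap each outbound call at 500 ms.
Duration available = remaining.minusMillis(100);
Duration cap = Duration.ofMillis(500);
Duration budget = available.compareTo(cap) < 0 ? available : cap;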
Mono<ProductView> product = catalogClient.getProduct(id, ctx).timeout(budget);
Mono<PriceView> price = priceClient.getPrice(id, ctx).timeout(budget);
Mono<InventoryView> inventory = inventoryClient.getInventory(id, ctx).timeout(budget);
return Mono.zip(product, price, inventory)
.map(tuple -> ProductPage.of(tuple.getT1(), tuple.getT2(), tuple.getT3()));
}
A deadline is not just a timeout. It is a propagated constraint across the call graph.
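The budget arithmetic from this section fits in one small function. A minimal sketch, assuming a hypothetical DeadlineBudgetSketch helper: it subtracts the local reserve from the remaining deadline, caps the result per call, and never returns a negative duration.

```java
import java.time.Duration;

final class DeadlineBudgetSketch {
    // remaining:    time left on the inbound request's deadline
    // localReserve: time kept back for this service's own work
    // perCallCap:   upper bound for any single outbound call
    static Duration outboundBudget(Duration remaining, Duration localReserve, Duration perCallCap) {
        Duration available = remaining.minus(localReserve);
        if (available.isNegative()) {
            return Duration.ZERO; // deadline already spent: fail fast instead of calling
        }
        return available.compareTo(perCallCap) < 0 ? available : perCallCap;
    }
}
```

With 900 ms remaining, a 100 ms reserve, and a 300 ms cap, the budget is 300 ms. With only 250 ms remaining it shrinks to 150 ms, and with 50 ms remaining it is zero, which is the signal to skip the call entirely.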
9. Retry with Reactor: Be Very Explicit
Reactor makes retry easy:
.retry(3)
That is almost never production-grade.
It retries too mechanically and does not encode:
which errors are retryable
whether the method is safe/idempotent
whether the deadline still has budget
whether backoff/jitter is used
whether retry amplification is acceptable
whether server asked us to slow down
A safer pattern:
Retry retrySpec = Retry.backoff(2, Duration.ofMillis(50))
.maxBackoff(Duration.ofMillis(200))
.jitter(0.5)
.filter(this::isRetryable)
.doBeforeRetry(signal -> log.warn(
"Retrying catalog call attempt={} error={}",
signal.totalRetries() + 1,
signal.failure().toString()
));
return webClient.get()
.uri("/v1/products/{id}", id.value())
.retrieve()
.onStatus(this::isRetryableStatus, this::mapRetryableStatus)
.onStatus(HttpStatusCode::is4xxClientError, this::mapNonRetryable4xx)
.bodyToMono(ProductView.class)
.timeout(Duration.ofMillis(400))
.retryWhen(retrySpec);
Retry eligibility must be based on semantics, not exception type alone.
Good retry candidates:
connection reset before response
connect timeout before request body sent
503 overload with Retry-After, within budget
429 throttled, if caller policy allows delayed retry
idempotent GET with no side effects
idempotent command with idempotency key
Bad retry candidates:
validation failure
authorization failure
business conflict
non-idempotent command without idempotency key
unknown outcome after write completion
large upload mid-stream failure
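The two halves of a retry policy, eligibility and delay, can be sketched without Reactor. The code below is an illustration under assumptions, not Reactor's implementation: RetryPolicySketch is a hypothetical name, the retryable status set (429, 503) and the idempotent method list are choices made for this example, and the jitter spreads each delay by up to plus or minus the given factor.

```java
import java.time.Duration;
import java.util.Random;
import java.util.Set;

final class RetryPolicySketch {
    // Methods defined as idempotent by HTTP semantics (subset used in this sketch).
    private static final Set<String> IDEMPOTENT_METHODS = Set.of("GET", "HEAD", "PUT", "DELETE", "OPTIONS");
    // Assumption for this example: only throttling and overload are retried.
    private static final Set<Integer> RETRYABLE_STATUSES = Set.of(429, 503);

    // Eligibility depends on semantics: the call must be safe to repeat AND the failure transient.
    static boolean isRetryable(String method, int status, boolean hasIdempotencyKey) {
        boolean safeToRepeat = IDEMPOTENT_METHODS.contains(method) || hasIdempotencyKey;
        return safeToRepeat && RETRYABLE_STATUSES.contains(status);
    }

    // Exponential backoff: first, 2*first, 4*first, ... capped at max. Attempt is 1-based.
    static Duration baseDelay(int attempt, Duration first, Duration max) {
        int doublings = Math.min(Math.max(attempt - 1, 0), 20);
        long millis = first.toMillis() << doublings;
        return Duration.ofMillis(Math.min(millis, max.toMillis()));
    }

    // Randomizes the delay within [base * (1 - jitter), base * (1 + jitter)].
    static Duration jittered(Duration base, double jitterFactor, Random random) {
        double offset = (random.nextDouble() * 2 - 1) * jitterFactor;
        return Duration.ofMillis(Math.round(base.toMillis() * (1 + offset)));
    }
}
```

A POST that fails with 503 is retried here only when it carries an idempotency key, which is the "good candidate" versus "bad candidate" distinction from the lists above expressed as a predicate.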
10. Concurrency Control: flatMap Is a Load Generator
This looks elegant:
Flux.fromIterable(productIds)
.flatMap(catalogClient::getProduct)
.collectList();
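But flatMap subscribes to many inner publishers at once. With no explicit argument, Reactor's default concurrency is 256, so a single list can trigger up to 256 simultaneous downstream calls.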
Better:
Flux.fromIterable(productIds)
.flatMap(catalogClient::getProduct, 20)
.collectList();
The second parameter bounds concurrency.
Without a limit, a single inbound request can become a downstream fan-out attack.
Production rule:
Every fan-out must have an explicit concurrency limit.
Example:
public Mono<List<ProductView>> getProducts(List<ProductId> ids) {
return Flux.fromIterable(ids)
.distinct()
.take(100)
.flatMap(id -> getProduct(id)
.onErrorResume(ProductNotFoundException.class, e -> Mono.empty()), 16)
.collectList();
}
Design notes:
distinct() prevents duplicate work
take(100) bounds request expansion
flatMap(..., 16) bounds downstream concurrency
onErrorResume classifies acceptable partial failure
collectList() is safe only if result cardinality is bounded
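The effect of a concurrency limit can be observed directly. The sketch below is a JDK-only illustration of bounding in-flight work with a semaphore; it blocks threads, so it is not how flatMap works internally. BoundedFanOutSketch and MAX_IN_FLIGHT are hypothetical names, and the counter exists only to make the bound visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class BoundedFanOutSketch {
    // Highest number of simultaneously running calls observed so far.
    static final AtomicInteger MAX_IN_FLIGHT = new AtomicInteger();

    static <T, R> List<R> fanOut(List<T> inputs, int maxConcurrency, Function<T, R> call) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, inputs.size()));
        Semaphore permits = new Semaphore(maxConcurrency);
        AtomicInteger inFlight = new AtomicInteger();
        try {
            List<CompletableFuture<R>> futures = new ArrayList<>();
            for (T input : inputs) {
                futures.add(CompletableFuture.supplyAsync(() -> {
                    permits.acquireUninterruptibly(); // wait for a free slot
                    try {
                        MAX_IN_FLIGHT.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                        return call.apply(input);
                    } finally {
                        inFlight.decrementAndGet();
                        permits.release();
                    }
                }, pool));
            }
            List<R> results = new ArrayList<>();
            for (CompletableFuture<R> future : futures) {
                results.add(future.join()); // results stay in input order
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Even though the pool has one thread per input, the number of calls running at the same time never exceeds maxConcurrency. That is the property the second argument of flatMap provides, without the blocked threads.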
11. Backpressure Is Not a Magic Shield
Reactive Streams defines a demand-aware contract between publisher and subscriber.
But an HTTP server is not automatically protected just because the client code uses Flux.
Backpressure helps when:
producer can slow down
consumer demand is propagated
intermediate buffers are bounded
transport supports flow control
application code does not collect unbounded data
Backpressure does not help when:
you call collectList() on huge streams
you buffer the entire response body in memory
the remote service ignores flow control
you use unbounded flatMap concurrency
you bridge to blocking queues without limits
Example risky code:
return webClient.get()
.uri("/v1/export")
.retrieve()
.bodyToFlux(ExportRow.class)
.collectList();
This turns a stream into an unbounded memory buffer.
Better:
return webClient.get()
.uri("/v1/export")
.retrieve()
.bodyToFlux(ExportRow.class)
.limitRate(500)
.concatMap(rowWriter::write);
Even better, make the maximum export size and sink backpressure explicit.
12. Response Body Size and Codec Limits
A production client must bound response buffering.
Spring's WebClient codec configuration can control in-memory limits:
ExchangeStrategies strategies = ExchangeStrategies.builder()
.codecs(configurer -> configurer.defaultCodecs().maxInMemorySize(512 * 1024))
.build();
WebClient webClient = WebClient.builder()
.exchangeStrategies(strategies)
.baseUrl("http://catalog")
.build();
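This limit matters because bodyToMono(SomeDto.class) buffers the full body in memory before decoding it. Spring's default limit is 256 KB, and a larger body fails with DataBufferLimitException. If that limit is raised carelessly or disabled, an unexpectedly large body becomes a memory-pressure vector.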
Payload policy should include:
maximum response body size
maximum error body size
maximum collection cardinality
streaming vs buffering decision
codec whitelist
content-type validation
compression ratio guard
Large responses are an API design smell unless explicitly modeled as streaming, pagination, or file transfer.
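The idea behind a maximum response body size is simple enough to show on a plain InputStream. This is a sketch of the principle, not of Spring's codec internals: BoundedBodySketch and readAtMost are hypothetical names, and the reader fails as soon as the cap would be exceeded instead of buffering first and checking later.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class BoundedBodySketch {
    // Reads the whole body, but refuses to buffer more than maxBytes.
    static byte[] readAtMost(InputStream body, int maxBytes) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try {
            int read;
            while ((read = body.read(chunk)) != -1) {
                if (buffer.size() + read > maxBytes) {
                    // Fail before the oversized chunk is copied into memory.
                    throw new IllegalStateException("response body exceeds " + maxBytes + " bytes");
                }
                buffer.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

The same cap is worth applying to error bodies, since a misbehaving server can return a large HTML page where a small problem document was expected.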
13. Context Propagation
Thread-local context is fragile in reactive code.
In a blocking servlet stack, code often relies on ThreadLocal:
current user
request id
trace id
tenant id
locale
deadline
In reactive pipelines, execution may move across threads. Reactor provides a context model, and observability libraries integrate with it, but application code must avoid hidden assumptions.
Bad:
String correlationId = CorrelationContextHolder.get(); // ThreadLocal
return webClient.get()
.uri("/v1/products/{id}", id)
.header("X-Correlation-Id", correlationId)
.retrieve()
.bodyToMono(ProductView.class);
Better:
public Mono<ProductView> getProduct(ProductId id) {
return Mono.deferContextual(ctx -> {
RequestContext requestContext = ctx.get(RequestContext.class);
return webClient.get()
.uri("/v1/products/{id}", id.value())
.header("X-Correlation-Id", requestContext.correlationId())
.header("X-Request-Id", requestContext.requestId())
.retrieve()
.bodyToMono(ProductView.class);
});
}
At the edge:
return handler(request)
.contextWrite(ctx -> ctx.put(RequestContext.class, requestContext));
The invariant:
context propagation must not depend on accidental thread affinity
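Why thread affinity is accidental can be demonstrated with the JDK alone. In this sketch, ContextHandoffSketch and its CORRELATION_ID holder are hypothetical names, and a single-thread executor stands in for whatever thread a reactive pipeline happens to continue on.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ContextHandoffSketch {
    // Hypothetical ThreadLocal holder, similar in spirit to CorrelationContextHolder.
    static final ThreadLocal<String> CORRELATION_ID = new ThreadLocal<>();

    // Builds the header on another thread by reading the ThreadLocal there.
    static String headerFromThreadLocal() {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            return CompletableFuture
                    .supplyAsync(() -> "X-Correlation-Id=" + CORRELATION_ID.get(), worker)
                    .join();
        } finally {
            worker.shutdown();
        }
    }

    // Builds the header on another thread from a value passed explicitly.
    static String headerFromExplicitContext(String correlationId) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            return CompletableFuture
                    .supplyAsync(() -> "X-Correlation-Id=" + correlationId, worker)
                    .join();
        } finally {
            worker.shutdown();
        }
    }
}
```

If the caller sets CORRELATION_ID to "abc" and then calls both methods, the ThreadLocal variant produces "X-Correlation-Id=null" while the explicit variant produces "X-Correlation-Id=abc". Reactor's context plays the role of the explicit argument.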
14. Filters as Client Middleware
ExchangeFilterFunction is WebClient's middleware hook.
It can add headers, log, tag metrics, enforce policy, or map behavior.
Example header propagation filter:
public final class ContextPropagationFilter {
public static ExchangeFilterFunction propagate() {
return (request, next) -> Mono.deferContextual(contextView -> {
RequestContext ctx = contextView.getOrDefault(
RequestContext.class,
RequestContext.anonymous()
);
ClientRequest mutated = ClientRequest.from(request)
.header("X-Request-Id", ctx.requestId())
.header("X-Correlation-Id", ctx.correlationId())
.header("X-Deadline-Ms", Long.toString(ctx.deadline().epochMillis()))
.build();
return next.exchange(mutated);
});
}
}
Register it:
WebClient webClient = WebClient.builder()
.baseUrl("http://catalog")
.filter(ContextPropagationFilter.propagate())
.build();
Filters should be small and deterministic.
Avoid filters that:
perform hidden remote calls
block event-loop threads
swallow errors globally
mutate bodies without clear ownership
turn all status codes into generic RuntimeException
15. Blocking Inside Reactive Code
This is one of the most damaging WebClient failure modes:
return webClient.get()
.uri("/v1/products/{id}", id)
.retrieve()
.bodyToMono(ProductView.class)
.map(product -> blockingRepository.save(product));
If blockingRepository.save() blocks an event-loop thread, throughput can collapse.
If blocking cannot be avoided, move it to a bounded scheduler:
return webClient.get()
.uri("/v1/products/{id}", id)
.retrieve()
.bodyToMono(ProductView.class)
.flatMap(product -> Mono.fromCallable(() -> blockingRepository.save(product))
.subscribeOn(Schedulers.boundedElastic()));
But this is not a free pass. Bounded elastic is still capacity that can be exhausted.
Architecture rule:
Do not mix blocking dependencies into reactive request paths without an explicit isolation plan.
16. Cancellation
Reactive cancellation is a real operational capability.
If the client disconnects, the upstream pipeline can be cancelled. If wired correctly, downstream WebClient calls can also be cancelled.
This matters for expensive fan-out or streaming calls.
Example:
return catalogClient.streamProducts(query)
.takeUntil(product -> product.isEnoughForResultPage())
.doOnCancel(() -> log.info("product stream cancelled query={}", query.id()));
But cancellation is not a business transaction rollback.
If a downstream command was already committed, cancelling the caller does not undo it.
For command APIs:
cancellation means the caller stopped waiting
not that the callee did not execute
This is the same unknown-outcome problem from earlier parts.
17. Streaming Responses
WebClient can consume streaming HTTP responses:
public Flux<InventoryEvent> streamInventory(ProductId id) {
return webClient.get()
.uri("/v1/products/{id}/inventory-stream", id.value())
.accept(MediaType.TEXT_EVENT_STREAM)
.retrieve()
.bodyToFlux(InventoryEvent.class);
}
Streaming requires different policies from request/response:
idle timeout instead of short response timeout
heartbeat expectations
reconnect policy
last-seen cursor
bounded downstream processing
backpressure strategy
partial result semantics
cancellation handling
A streaming response is not just a long GET. It is a long-lived communication relationship.
18. Parallel Composition
A common WebClient use case is aggregator services.
Example:
public Mono<ProductPage> getProductPage(ProductId id, RequestContext ctx) {
Mono<ProductView> product = catalogClient.getProduct(id, ctx);
Mono<PriceView> price = priceClient.getPrice(id, ctx);
Mono<InventoryView> inventory = inventoryClient.getInventory(id, ctx);
Mono<RecommendationView> recommendations = recommendationClient.getRecommendations(id, ctx)
.timeout(Duration.ofMillis(120))
.onErrorReturn(RecommendationView.empty());
return Mono.zip(product, price, inventory, recommendations)
.map(t -> ProductPage.of(t.getT1(), t.getT2(), t.getT3(), t.getT4()));
}
This should encode required vs optional dependencies:
catalog required
price required
inventory required
recommendation optional
If all dependencies are zipped without fallback, the slowest required dependency determines latency and any failure fails the whole result.
That may be correct. It must be deliberate.
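The required versus optional split is independent of Reactor and can be checked with CompletableFuture. A minimal sketch, assuming a hypothetical RequiredOptionalSketch class and string placeholders instead of real view types: the optional input gets a fallback before it is combined, the required inputs do not.

```java
import java.util.concurrent.CompletableFuture;

final class RequiredOptionalSketch {
    static CompletableFuture<String> assemble(
            CompletableFuture<String> product,
            CompletableFuture<String> price,
            CompletableFuture<String> recommendations) {
        // Optional dependency: any failure degrades to a placeholder value.
        CompletableFuture<String> safeRecommendations =
                recommendations.exceptionally(error -> "no-recommendations");
        // Required dependencies: a failure in either one fails the whole page.
        return product
                .thenCombine(price, (p, pr) -> p + "|" + pr)
                .thenCombine(safeRecommendations, (page, recs) -> page + "|" + recs);
    }
}
```

A failed recommendations call still yields a page, while a failed catalog call completes the result exceptionally. The fallback sits on the individual dependency, not on the combined result, so it cannot mask a required failure.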
19. Sequential Composition
Some calls must be sequential because the second depends on the first.
public Mono<OrderQuote> quote(OrderRequest request, RequestContext ctx) {
return customerClient.getCustomer(request.customerId(), ctx)
.flatMap(customer -> priceClient.quote(request, customer.segment(), ctx))
.flatMap(price -> inventoryClient.reservePreview(request.items(), ctx)
.map(inventory -> OrderQuote.of(price, inventory)));
}
Sequential composition increases latency:
T_total = T_customer + T_price + T_inventory
A senior design question:
Is the dependency truly sequential, or did we accidentally design the API so it must be sequential?
Sometimes the better fix is API shape:
price quote endpoint accepts customer segment directly
inventory preview accepts item list directly
customer lookup is cached or embedded in caller context
Communication code reveals API design problems.
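The latency cost of sequencing is plain arithmetic: a chain pays the sum of its steps, while independent calls issued together pay roughly the slowest one. The helper below is a hypothetical illustration with made-up latencies, ignoring scheduling overhead and queueing.

```java
import java.util.Arrays;

final class LatencyModelSketch {
    // Sequential chain: T_total = T_1 + T_2 + ... + T_n
    static long sequentialMillis(long... stepMillis) {
        return Arrays.stream(stepMillis).sum();
    }

    // Independent calls in parallel: T_total is approximately max(T_1, ..., T_n)
    static long parallelMillis(long... stepMillis) {
        return Arrays.stream(stepMillis).max().orElse(0L);
    }
}
```

For assumed latencies of 120 ms, 80 ms, and 150 ms, the sequential chain takes 350 ms and the parallel shape takes 150 ms. That gap is what an API redesign buys when the dependency between calls turns out to be accidental.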
20. WebClient and Resilience4j
Resilience4j can decorate reactive pipelines.
Example conceptual usage:
CircuitBreaker circuitBreaker = circuitBreakerRegistry.circuitBreaker("catalog");
TimeLimiter timeLimiter = timeLimiterRegistry.timeLimiter("catalog");
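return webClient.get()
.uri("/v1/products/{id}", id.value())
.retrieve()
.bodyToMono(ProductView.class)
// Time limit first, so a timeout surfaces as an error the circuit breaker can record.
.transformDeferred(TimeLimiterOperator.of(timeLimiter))
.transformDeferred(CircuitBreakerOperator.of(circuitBreaker));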
The important point is not the exact library call. It is policy placement:
timeout before retry budget exhaustion
retry only for eligible operations
circuit breaker around downstream dependency
bulkhead around caller's resource consumption
fallback only when semantics allow degraded result
Do not hide resilience annotations at random layers and assume the call graph is safe.
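To make "circuit breaker around downstream dependency" concrete, here is a deliberately small state machine. It is a teaching sketch and not Resilience4j's algorithm: CircuitBreakerSketch is a hypothetical class, it counts consecutive failures instead of using a sliding window, and the clock is injected so the behavior is deterministic.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

final class CircuitBreakerSketch {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock;
    private State state = State.CLOSED;
    private int consecutiveFailures;
    private long openedAt;

    CircuitBreakerSketch(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    // After the cool-down, an open breaker lets trial calls through (half-open).
    synchronized State state() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    <T> T call(Supplier<T> downstream) {
        if (state() == State.OPEN) {
            // Rejected without touching the downstream service.
            throw new IllegalStateException("circuit open: call rejected");
        }
        try {
            T result = downstream.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            throw e;
        }
    }

    private synchronized void onSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    private synchronized void onFailure() {
        consecutiveFailures++;
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAt = clock.getAsLong();
        }
    }
}
```

The point for policy placement is visible in call(): a rejection happens before the downstream is invoked, so the breaker protects the callee, while timeouts and retries decide what counts as a failure in the first place.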
21. Observability
A production WebClient must emit enough telemetry to answer:
Which downstream service is slow?
Which route is failing?
Which status codes increased?
Which exception classes dominate?
Which calls timed out vs were rejected by pool?
How many retries were attempted?
Are we saturating connection pools?
Which trace shows the full call chain?
Minimum dimensions:
peer.service
http.request.method
http.route or templated URI
http.response.status_code
error.type
client.operation
retry.count
circuit.state
outcome
Avoid high-cardinality labels:
raw URL with ids
request id
user id
tenant id if tenant cardinality is huge
full exception message
Good operation naming:
catalog.getProduct
pricing.quote
inventory.reservePreview
Bad operation naming:
GET /v1/products/123456
HTTP call
WebClient request
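When only a raw path is available, a label can still be kept low-cardinality by collapsing id-like segments. The sketch below is a fallback illustration, not a replacement for recording the URI template: RouteLabelSketch is a hypothetical helper and it assumes ids are purely numeric path segments.

```java
import java.util.regex.Pattern;

final class RouteLabelSketch {
    // Matches a path segment made only of digits, e.g. "/123456".
    private static final Pattern NUMERIC_SEGMENT = Pattern.compile("/\\d+(?=/|$)");

    // Replaces numeric segments with a placeholder so the label set stays bounded.
    static String templated(String rawPath) {
        return NUMERIC_SEGMENT.matcher(rawPath).replaceAll("/{id}");
    }

    // Stable operation name in the "peer.operation" style used above.
    static String operationLabel(String peerService, String operation) {
        return peerService + "." + operation;
    }
}
```

Here templated("/v1/products/123456") returns "/v1/products/{id}", so every product lookup lands in one time series instead of one per product. Non-numeric ids such as "p-1" would pass through unchanged, which is why the templated URI from the client code is the better source.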
22. Logging
Log communication facts, not payload dumps.
Good log fields:
operation
peer.service
method
route
status
latency_ms
request_id
correlation_id
attempt
exception_class
retryable
Avoid by default:
Authorization headers
cookies
PII payloads
full request/response bodies
binary bodies
large error payloads
Example:
.doOnEach(signal -> {
if (signal.isOnError()) {
log.warn("catalog call failed operation={} peer={} error={}",
"catalog.getProduct",
"catalog",
signal.getThrowable().getClass().getSimpleName());
}
})
For richer logging, prefer a filter that redacts and caps values.
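The redact-and-cap step can be a pure function that a logging filter calls before writing anything. This is a sketch with assumed names and an assumed list of sensitive headers: LogRedactionSketch, the "[REDACTED]" marker, and the truncation suffix are all illustrative choices.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class LogRedactionSketch {
    // Header names (lower-case) whose values must never reach the logs.
    private static final Set<String> SENSITIVE = Set.of("authorization", "cookie", "set-cookie");

    // Returns a copy that is safe to log: secrets masked, long values truncated.
    static Map<String, String> redact(Map<String, String> headers, int maxValueLength) {
        Map<String, String> safe = new LinkedHashMap<>();
        headers.forEach((name, value) -> {
            if (SENSITIVE.contains(name.toLowerCase(Locale.ROOT))) {
                safe.put(name, "[REDACTED]");
            } else if (value.length() > maxValueLength) {
                safe.put(name, value.substring(0, maxValueLength) + "...");
            } else {
                safe.put(name, value);
            }
        });
        return safe;
    }
}
```

Because the function returns a copy, the original request headers are untouched and the real Authorization value still reaches the downstream service.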
23. WebClient Bean Design
Do not create a new WebClient for every request.
Prefer a configured bean per downstream service:
@Configuration
class CatalogClientConfiguration {
@Bean
WebClient catalogWebClient(WebClient.Builder builder) {
return builder
.baseUrl("http://catalog")
.defaultHeader(HttpHeaders.ACCEPT, MediaType.APPLICATION_JSON_VALUE)
.filter(ContextPropagationFilter.propagate())
.filter(ObservabilityFilter.clientOperation("catalog"))
.build();
}
}
Then wrap it in a typed client:
@Component
public final class CatalogClient {
private final WebClient webClient;
public CatalogClient(@Qualifier("catalogWebClient") WebClient webClient) {
this.webClient = webClient;
}
public Mono<ProductView> getProduct(ProductId id) {
return webClient.get()
.uri("/v1/products/{id}", id.value())
.retrieve()
.onStatus(status -> status.value() == 404, this::notFound)
.onStatus(HttpStatusCode::isError, this::remoteError)
.bodyToMono(ProductView.class)
.timeout(Duration.ofMillis(400));
}
}
Application code should depend on CatalogClient, not raw WebClient.
24. Client Boundary Design
A typed WebClient wrapper should expose business operations:
interface CatalogClient {
Mono<ProductView> getProduct(ProductId id);
Flux<ProductSummary> searchProducts(ProductSearchQuery query);
}
Not raw HTTP details:
Mono<ResponseEntity<String>> exchange(String method, String url, Map<String, String> headers);
The wrapper owns:
base URL
routes
headers
status mapping
body mapping
timeout
retry eligibility
observability labels
error taxonomy
fallback semantics
Callers should not know whether the downstream uses HTTP, gRPC, or messaging.
25. Error Taxonomy
Map protocol failures into stable exception/result types.
Example:
sealed interface CatalogFailure permits
ProductNotFound,
CatalogValidationFailure,
CatalogConflict,
CatalogThrottled,
CatalogUnavailable,
CatalogProtocolFailure,
CatalogDecodeFailure {
}
For Java exceptions:
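public class CatalogUnavailableException extends RuntimeException {
private final boolean retryable;
private final String peerService;
private final String operation;
public CatalogUnavailableException(
String message,
Throwable cause,
boolean retryable,
String peerService,
String operation
) {
super(message, cause);
this.retryable = retryable;
this.peerService = peerService;
this.operation = operation;
}
// Accessors let retry filters and logging read the classification without parsing messages.
public boolean retryable() { return retryable; }
public String peerService() { return peerService; }
public String operation() { return operation; }
}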
Avoid exposing low-level exceptions across application boundaries:
WebClientResponseException
PrematureCloseException
ReadTimeoutException
DecoderException
Those are implementation details.
26. Handling Problem Details
If the server uses RFC 9457 Problem Details, parse it explicitly:
private Mono<? extends Throwable> remoteError(ClientResponse response) {
return response.bodyToMono(RemoteProblem.class)
.defaultIfEmpty(RemoteProblem.unknown())
.map(problem -> switch (response.statusCode().value()) {
case 400, 422 -> new CatalogValidationException(problem);
case 404 -> new ProductNotFoundException(problem);
case 409 -> new CatalogConflictException(problem);
case 429 -> new CatalogThrottledException(problem);
case 503 -> new CatalogUnavailableException(problem, true);
default -> new CatalogProtocolException(problem, response.statusCode().value());
});
}
Error response mapping should preserve:
status code
problem type
problem title/detail if safe
remote error code
request id / trace id
retryability
operation
peer service
Never require callers to parse raw response bodies to understand failure semantics.
27. Testing WebClient Clients
Test the typed client, not Spring internals.
Good tests:
200 maps to DTO
404 maps to Optional.empty or not-found exception
409 maps to conflict exception
429 is retryable/throttled
503 is unavailable
malformed JSON maps to decode failure
large response is rejected
request includes trace/correlation headers
operation timeout is applied
retry does not happen for non-idempotent call
Use an HTTP stub server such as WireMock, MockWebServer, or Spring test support.
Example structure:
@Test
void mapsNotFoundToEmpty() {
server.enqueue(response(404, problemJson("product.not_found")));
Mono<Optional<ProductView>> result = client.findProduct(ProductId.of("p-1"));
StepVerifier.create(result)
.expectNext(Optional.empty())
.verifyComplete();
}
Reactive clients should be tested with StepVerifier or equivalent reactive-aware assertions.
Do not hide asynchronous behavior behind .block() in every test unless the production boundary is explicitly blocking.
28. Production WebClient Template
Below is a realistic skeleton.
@Configuration
class DownstreamHttpConfiguration {
@Bean
ConnectionProvider catalogConnectionProvider() {
return ConnectionProvider.builder("catalog-pool")
.maxConnections(150)
.pendingAcquireMaxCount(300)
.pendingAcquireTimeout(Duration.ofMillis(100))
.maxIdleTime(Duration.ofSeconds(30))
.maxLifeTime(Duration.ofMinutes(5))
.build();
}
@Bean
WebClient catalogWebClient(
WebClient.Builder builder,
ConnectionProvider catalogConnectionProvider
) {
HttpClient httpClient = HttpClient.create(catalogConnectionProvider)
.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 200)
.responseTimeout(Duration.ofMillis(600))
.doOnConnected(connection -> connection
.addHandlerLast(new ReadTimeoutHandler(600, TimeUnit.MILLISECONDS))
.addHandlerLast(new WriteTimeoutHandler(300, TimeUnit.MILLISECONDS)));
ExchangeStrategies strategies = ExchangeStrategies.builder()
.codecs(configurer -> configurer.defaultCodecs().maxInMemorySize(512 * 1024))
.build();
return builder
.baseUrl("http://catalog")
.clientConnector(new ReactorClientHttpConnector(httpClient))
.exchangeStrategies(strategies)
.defaultHeader(HttpHeaders.ACCEPT, MediaType.APPLICATION_JSON_VALUE)
.filter(ContextPropagationFilter.propagate())
.filter(ObservabilityFilter.clientOperation("catalog"))
.build();
}
}
Typed client:
@Component
public final class CatalogWebClient {
private final WebClient webClient;
private final Retry productLookupRetry;
public CatalogWebClient(@Qualifier("catalogWebClient") WebClient webClient) {
this.webClient = webClient;
this.productLookupRetry = Retry.backoff(2, Duration.ofMillis(40))
.maxBackoff(Duration.ofMillis(160))
.jitter(0.5)
.filter(this::isRetryable);
}
public Mono<ProductView> getProduct(ProductId id) {
return webClient.get()
.uri("/v1/products/{id}", id.value())
.retrieve()
.onStatus(status -> status.value() == 404, this::notFound)
.onStatus(HttpStatusCode::is4xxClientError, this::clientError)
.onStatus(HttpStatusCode::is5xxServerError, this::serverError)
.bodyToMono(ProductView.class)
.timeout(Duration.ofMillis(450))
.retryWhen(productLookupRetry)
.name("catalog.getProduct")
.tag("peer.service", "catalog");
}
private boolean isRetryable(Throwable throwable) {
return throwable instanceof CatalogUnavailableException e && e.retryable();
}
private Mono<? extends Throwable> notFound(ClientResponse response) {
return response.releaseBody()
.then(Mono.just(new ProductNotFoundException()));
}
private Mono<? extends Throwable> clientError(ClientResponse response) {
return response.bodyToMono(RemoteProblem.class)
.defaultIfEmpty(RemoteProblem.unknown())
.map(problem -> new CatalogClientException(problem, response.statusCode().value()));
}
private Mono<? extends Throwable> serverError(ClientResponse response) {
return response.bodyToMono(RemoteProblem.class)
.defaultIfEmpty(RemoteProblem.unknown())
.map(problem -> new CatalogUnavailableException(problem, true));
}
}
This is still only a skeleton. Production values must be derived from SLO, latency distribution, downstream capacity, and retry budget.
29. Decision Table: RestClient vs WebClient
| Question | Prefer RestClient | Prefer WebClient |
|---|---|---|
| Application is synchronous servlet-style | Yes | Maybe |
| Application is fully reactive | No | Yes |
| One simple blocking call | Yes | Usually no |
| Many parallel calls | Maybe | Yes |
| Streaming response | No | Yes |
| Need reactive composition | No | Yes |
| Team understands Reactor well | Optional | Required |
| Need minimal cognitive overhead | Yes | No |
| Need backpressure-aware pipeline | No | Yes |
| You will call .block() everywhere | Yes, use RestClient | No |
The correct choice is architectural, not fashionable.
30. Failure Modes Checklist
Before approving a WebClient-based client, verify:
[ ] one configured WebClient per downstream service
[ ] no per-request WebClient construction
[ ] base URL externalized
[ ] connection pool configured and bounded
[ ] pending acquire timeout configured
[ ] connect timeout configured
[ ] response/read/write timeout configured
[ ] operation-level timeout configured
[ ] max in-memory response size configured
[ ] status code mapping explicit
[ ] Problem Details parsed or error body capped
[ ] retry policy semantic and bounded
[ ] retry uses backoff and jitter
[ ] fan-out concurrency bounded
[ ] no accidental blocking on event-loop threads
[ ] context propagation does not depend on ThreadLocal
[ ] trace/correlation headers propagated deliberately
[ ] metrics use templated routes, not raw URLs
[ ] tests cover failure mapping and timeouts
[ ] streaming paths define cancellation and idle timeout behavior
31. Mental Model Summary
WebClient is not "a better RestTemplate".
It is a different execution model.
Use it when you want:
non-blocking I/O
reactive composition
bounded concurrent fan-out
streaming
cancellation-aware pipelines
backpressure-aware processing
Avoid it when it becomes:
blocking code with reactive syntax
unbounded fan-out generator
hidden thread model
error handling maze
ThreadLocal trap
payload buffering risk
The production rule:
WebClient is safe only when execution, concurrency, timeout, retry, body size, context, and observability policies are explicit.
That is the difference between reactive HTTP and reactive-shaped incident generation.
References
- Spring Framework Reference — WebClient.
- Spring Framework Reference — WebClient configuration, codecs, Reactor Netty, resources, and timeouts.
- Reactive Streams specification.
- Reactor reference documentation.
- RFC 9110 — HTTP Semantics.
- RFC 9457 — Problem Details for HTTP APIs.
- OpenTelemetry semantic conventions for HTTP client telemetry.
- AWS Builders Library — Timeouts, retries, and backoff with jitter.