Circuit Breaker Design with Resilience4j

Learn Java Microservices Communication - Part 041

Circuit breaker design for Java microservices with Resilience4j: state machine, failure rate, slow call rate, sliding windows, half-open probes, exception classification, composition with timeout/retry/bulkhead, observability, testing, and production policy.

2026-07-05 · 16 min read · 3176 words


Part 041 — Circuit Breaker Design with Resilience4j

A circuit breaker is not a magic shield.

It does not make a broken dependency healthy.

It does not make an unsafe command safe.

It does not replace timeout, retry, bulkhead, fallback, or capacity planning.

A circuit breaker does one specific thing:

It stops sending calls to a dependency that is already showing sustained failure or unacceptable slowness.

That sounds small.

In a distributed system, it is critical.

Without a circuit breaker, every caller continues to spend resources on a dependency that is unlikely to succeed. Those wasted calls consume threads, sockets, connection pools, CPU, queues, retries, and user latency budget.

A circuit breaker converts repeated expensive failure into fast, classified failure.

That is failure containment.

1. The Core Mental Model

Imagine a caller repeatedly invoking a dependency.

When the dependency is healthy, calls pass through.

When enough calls fail or become too slow, the breaker opens.

The states mean:

State	Meaning	Caller behavior
`CLOSED`	Dependency assumed healthy	Calls are allowed
`OPEN`	Dependency assumed unhealthy	Calls are rejected immediately
`HALF_OPEN`	Dependency is being tested	Limited probe calls are allowed

Resilience4j also has special states such as DISABLED, FORCED_OPEN, and METRICS_ONLY, but the production mental model is still closed/open/half-open.
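
The three-state model can be sketched in plain Java. This is a minimal illustration, not Resilience4j's implementation: the class name `TinyBreaker` is hypothetical, it trips on consecutive failures instead of a failure rate, and it does not limit the number of half-open probes. Time is passed in explicitly so the behavior is deterministic.

```java
import java.time.Duration;

/** Illustrative three-state breaker; a teaching sketch, not Resilience4j's implementation. */
final class TinyBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failuresToOpen;   // assumption: consecutive failures that trip the breaker
    private final long openNanos;       // how long to stay OPEN before allowing a probe
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    TinyBreaker(int failuresToOpen, Duration waitInOpen) {
        this.failuresToOpen = failuresToOpen;
        this.openNanos = waitInOpen.toNanos();
    }

    /** Returns true if a call may proceed at time nowNanos. */
    synchronized boolean tryAcquire(long nowNanos) {
        if (state == State.OPEN && nowNanos - openedAt >= openNanos) {
            state = State.HALF_OPEN;    // wait elapsed: start probing
        }
        return state != State.OPEN;
    }

    synchronized void onSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;           // a successful probe closes the breaker
    }

    synchronized void onFailure(long nowNanos) {
        consecutiveFailures++;
        if (state == State.HALF_OPEN || consecutiveFailures >= failuresToOpen) {
            state = State.OPEN;         // trip, or re-open after a failed probe
            openedAt = nowNanos;
        }
    }

    synchronized State state() {
        return state;
    }
}
```

The caller asks `tryAcquire` before each remote call and reports the outcome afterwards. While the breaker is open, the remote call is never made.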

2. Why Circuit Breaker Exists

Remote calls fail differently from local calls.

A local method failure is usually cheap.

A remote call failure may require:

connection acquisition,
DNS lookup,
TCP connect,
TLS handshake,
request serialization,
proxy routing,
server queueing,
server execution,
timeout waiting,
response parsing,
retry attempt.

If a dependency is already failing, repeating this full process for every request wastes resources.

Circuit breaker short-circuits the call.

dependency unhealthy
→ do not spend full remote-call cost
→ fail fast
→ protect caller
→ give dependency time to recover

Martin Fowler describes the circuit breaker as wrapping a protected function call, monitoring failures, and tripping once failures reach a threshold so future calls return without invoking the protected function.

3. Circuit Breaker Is Not Retry

Retry and circuit breaker solve different problems.

Pattern	Question
Retry	Should this failed attempt be tried again?
Circuit breaker	Should this dependency be called at all right now?

Retry is optimistic.

Circuit breaker is defensive.

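A typical interaction: retry absorbs a brief transient failure; if failures persist across many calls, the breaker opens and further attempts are rejected without reaching the dependency.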

Bad design:

retry forever behind a breaker that never opens

Also bad:

breaker opens on one failure in low traffic

Good design:

small bounded retry for transient failure
+ circuit breaker for sustained failure
+ bulkhead to isolate capacity
+ fallback or fail-fast response

4. Circuit Breaker Is Not Timeout

Timeout bounds one call.

Circuit breaker uses outcomes from many calls.

Pattern	Scope
Timeout	One attempt
Circuit breaker	Rolling health of dependency/operation

Without timeout, a call may hang too long before the breaker can classify it.

Without circuit breaker, many calls can repeatedly time out.

They are complementary.

timeout -> turns slow call into bounded failure
circuit breaker -> stops repeated bounded failures

5. Circuit Breaker Is Not Bulkhead

Bulkhead limits concurrent resource usage.

Circuit breaker stops calls based on failure health.

Pattern	Protects against
Bulkhead	One dependency consuming too many caller resources
Circuit breaker	Repeated calls to unhealthy dependency
Timeout	One call taking too long
Retry	One transient failure
Rate limiter	Caller sending too much traffic

If a dependency is slow but still below breaker threshold, bulkhead still protects the caller.

If the bulkhead is saturated, rejected calls never reach the dependency, so the circuit breaker gets no new evidence about dependency health; it sees the local rejections only if it wraps the bulkhead and is configured to record them.

Again: complementary, not substitutes.

6. What Counts as Failure?

This is the most important design decision.

Not every exception should open the breaker.

Failure categories:

Failure	Count as breaker failure?	Reason
`400 Bad Request`	No	Caller bug, not dependency health
`401 Unauthorized`	Usually no	Credential/config issue
`403 Forbidden`	No	Authorization decision
`404 Not Found`	Usually no	Domain result
`409 Conflict` domain conflict	No	Business conflict
`409 Request in progress` from dedup	Usually no	Retry/dedup state, not health
`422 Domain validation`	No	Caller/domain problem
`429 Too Many Requests`	Maybe	Dependency throttling; may indicate overload
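`500 Internal Server Error`	Yes, if provider fault	Server-side failure
`502 Bad Gateway`	Yes	Upstream or proxy failure
`503 Service Unavailable`	Yes	Dependency unavailable or overloaded
`504 Gateway Timeout`	Yes	Outcome unknown; take care with commands
connect timeout	Yes	Dependency unreachable
read timeout	Yes	Dependency too slow to respond
pool acquisition timeout	Maybe	Caller-side saturation, not necessarily dependency health
bulkhead full	Usually no	Local rejection; not evidence for the dependency breaker
circuit open	No	Not a remote failure; do not double count blindly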

A circuit breaker should reflect dependency health, not caller mistakes.

If caller sends invalid requests and gets many 400s, opening the breaker would hide a caller bug and block valid traffic.
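
The table can be turned into a small status classifier. This is a sketch under assumptions: `BreakerOutcome` and `HttpStatusClassifier` are hypothetical names, every 4xx except 429 is treated as "ignore" (the table says "usually no" for some of them), and the "maybe" for 429 is exposed as a flag so the policy choice stays explicit.

```java
/** Hypothetical outcome categories for a dependency breaker. */
enum BreakerOutcome { RECORD_FAILURE, IGNORE, SUCCESS }

final class HttpStatusClassifier {
    private HttpStatusClassifier() {}

    /**
     * Maps an HTTP status to a breaker outcome.
     * countThrottling decides the "maybe" case for 429 Too Many Requests.
     */
    static BreakerOutcome classify(int status, boolean countThrottling) {
        if (status >= 200 && status < 400) {
            return BreakerOutcome.SUCCESS;
        }
        if (status == 429) {
            return countThrottling ? BreakerOutcome.RECORD_FAILURE : BreakerOutcome.IGNORE;
        }
        if (status >= 400 && status < 500) {
            return BreakerOutcome.IGNORE;          // caller or domain problem
        }
        // 5xx is a provider fault; any other unexpected status is also treated as unhealthy.
        return BreakerOutcome.RECORD_FAILURE;
    }
}
```

Timeouts and connection errors have no status code, so they still need exception-based classification.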

7. Exception Classification in Java

Use explicit classification.

/**
 * Decides which exceptions count as dependency failures for the circuit breaker.
 * The Remote* exception types are application-defined (mapped from client/HTTP errors).
 */
public final class CircuitBreakerFailureClassifier {
    public boolean shouldRecordFailure(Throwable throwable) {
        // Caller-side or domain outcomes: not evidence about dependency health.
        if (throwable instanceof RemoteValidationException
                || throwable instanceof RemoteAuthenticationException
                || throwable instanceof RemoteAuthorizationException
                || throwable instanceof RemoteDomainConflictException) {
            return false;
        }

        // Policy choice: throttling (429) is treated here as a sign of overload.
        if (throwable instanceof RemoteRateLimitedException) {
            return true;
        }

        // RemoteDependencyUnavailableException, RemoteTimeoutException,
        // and anything unclassified count as failures.
        return true;
    }
}

Resilience4j supports classification through:

recordException and ignoreException (predicates),
recordExceptions and ignoreExceptions (lists of exception classes).

Do not rely on default exception classification for production-grade semantics.

8. Failure Rate Threshold

A circuit breaker should not open after one random failure.

It should open after enough evidence.

Resilience4j calculates failure rate when a minimum number of calls has been recorded.

Example config:

CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .slidingWindowType(CircuitBreakerConfig.SlidingWindowType.COUNT_BASED)
    .slidingWindowSize(100)
    .minimumNumberOfCalls(50)
    .failureRateThreshold(50.0f)
    .build();

Meaning:

look at last 100 calls
do not calculate until at least 50 calls exist
open if >= 50% are failures

This avoids opening due to tiny sample size.

But beware low-traffic services.

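If an operation receives 5 calls/minute, minimumNumberOfCalls=100 means the breaker needs at least 20 minutes of traffic before it can calculate a failure rate at all, which is usually too slow a reaction.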
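
The interaction between window size, minimum calls, and threshold can be sketched with a ring buffer. This is an illustration only, not Resilience4j's implementation; `CountWindow` is a hypothetical name, and returning -1 while below the minimum is a convention chosen for this sketch.

```java
/** Illustrative count-based window over the last N call outcomes. */
final class CountWindow {
    private final boolean[] failed;   // ring buffer: true means the call failed
    private final int minimumCalls;
    private int recorded = 0;         // outcomes currently stored (at most N)
    private int next = 0;             // slot that the next outcome overwrites
    private int failures = 0;

    CountWindow(int size, int minimumCalls) {
        this.failed = new boolean[size];
        this.minimumCalls = minimumCalls;
    }

    void record(boolean isFailure) {
        if (recorded == failed.length) {
            if (failed[next]) {
                failures--;           // the oldest outcome falls out of the window
            }
        } else {
            recorded++;
        }
        failed[next] = isFailure;
        if (isFailure) {
            failures++;
        }
        next = (next + 1) % failed.length;
    }

    /** Failure rate in percent, or -1 while fewer than minimumCalls are recorded. */
    float failureRate() {
        if (recorded == 0 || recorded < minimumCalls) {
            return -1f;
        }
        return 100f * failures / recorded;
    }

    boolean shouldOpen(float thresholdPercent) {
        float rate = failureRate();
        return rate >= 0f && rate >= thresholdPercent;
    }
}
```

With a window of 10 and a minimum of 5, four straight failures do not open the breaker; the fifth recorded call makes the rate computable, and only then is the threshold applied.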

9. Count-Based vs Time-Based Sliding Window

Resilience4j supports count-based and time-based sliding windows.

Count-based

last N calls

Good when:

traffic rate is stable,
you want fixed sample size,
operation gets enough calls.

Risk:

under low traffic, old failures remain influential for a long time,
under high traffic, window covers a very short time.

Time-based

last N seconds

Good when:

you want time-local health,
traffic rate varies,
operational dashboards are time-based.

Risk:

low traffic may still have insufficient samples,
threshold may be noisy without minimumNumberOfCalls.

Decision:

Traffic pattern	Better starting point
high, stable traffic	count-based or time-based both ok
low traffic	time-based with careful minimum calls
bursty traffic	time-based often easier
batch jobs	count-based may be clearer
critical command API	conservative threshold + alerts
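
For comparison with the count-based case, here is a minimal time-based window. It is a sketch under assumptions: `TimeWindow` is a hypothetical name, and it stores every outcome individually for clarity, where a production implementation would more likely aggregate outcomes into time buckets.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative time-based window: only outcomes from the last windowMillis count. */
final class TimeWindow {
    private static final class Outcome {
        final long atMillis;
        final boolean failed;

        Outcome(long atMillis, boolean failed) {
            this.atMillis = atMillis;
            this.failed = failed;
        }
    }

    private final Deque<Outcome> outcomes = new ArrayDeque<>();
    private final long windowMillis;
    private int failures = 0;

    TimeWindow(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    void record(long nowMillis, boolean failed) {
        evict(nowMillis);
        outcomes.addLast(new Outcome(nowMillis, failed));
        if (failed) {
            failures++;
        }
    }

    /** Failure rate in percent over calls still in the window, or -1 below minimumCalls. */
    float failureRate(long nowMillis, int minimumCalls) {
        evict(nowMillis);
        if (outcomes.isEmpty() || outcomes.size() < minimumCalls) {
            return -1f;
        }
        return 100f * failures / outcomes.size();
    }

    /** Drops outcomes that are windowMillis old or older. */
    private void evict(long nowMillis) {
        while (!outcomes.isEmpty() && nowMillis - outcomes.peekFirst().atMillis >= windowMillis) {
            if (outcomes.removeFirst().failed) {
                failures--;
            }
        }
    }
}
```

Old failures age out by time here, regardless of how many calls arrive, which is why the minimum-calls guard matters more under low traffic.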

10. Slow Call Rate

Failure is not only exception.

A dependency that becomes very slow can cause cascading failure before it returns errors.

Resilience4j supports slow call rate.

Example:

CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .slowCallDurationThreshold(Duration.ofMillis(500))
    .slowCallRateThreshold(50.0f)
    .minimumNumberOfCalls(50)
    .build();

Meaning:

a call slower than 500 ms is slow
if >= 50% of measured calls are slow, open breaker

Slow-call circuit breaking is powerful.

It protects callers before hard failures happen.

But tune carefully:

too low threshold → false opens during normal tail latency,
too high threshold → slow dependency already harms caller,
not aligned with timeout → slow call may be counted only after timeout.

slowCallDurationThreshold should relate to operation latency budget.
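
The slow-call rate itself is simple arithmetic over measured durations. A minimal sketch, assuming a hypothetical helper `SlowCallRate` and treating a call as slow when its duration is strictly greater than the threshold:

```java
import java.time.Duration;
import java.util.List;

final class SlowCallRate {
    private SlowCallRate() {}

    /** Percentage of calls whose duration exceeds the slow-call threshold. */
    static float percentSlow(List<Duration> callDurations, Duration slowThreshold) {
        if (callDurations.isEmpty()) {
            return 0f;
        }
        long slow = callDurations.stream()
            .filter(d -> d.compareTo(slowThreshold) > 0)
            .count();
        return 100f * slow / callDurations.size();
    }
}
```

Running this over recent latency samples from production is one way to check whether a candidate threshold would have opened the breaker during normal tail latency.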

11. Half-Open Probes

After the breaker has been open for some time, it moves to half-open.

Half-open allows limited test calls.

Important settings:

Setting	Meaning
`waitDurationInOpenState`	How long to stay open before probing
`permittedNumberOfCallsInHalfOpenState`	Number of probe calls allowed
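`maxWaitDurationInHalfOpenState`	Longest time to stay half-open before switching back to open; 0 waits until all permitted probes complete
`automaticTransitionFromOpenToHalfOpenEnabled`	Whether the breaker moves to half-open on its own after the wait duration, without waiting for an incoming call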

Half-open probe volume must be small.

If you allow too many half-open calls, a recovering dependency can be hit by a surge.
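
Limiting probe volume comes down to handing out a fixed number of permits. A minimal sketch, with `HalfOpenGate` as a hypothetical name; it only shows the permit counting, not the decision to close or re-open afterwards.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative half-open gate: hands out a fixed number of probe permits. */
final class HalfOpenGate {
    private final AtomicInteger remaining;

    HalfOpenGate(int permittedProbes) {
        this.remaining = new AtomicInteger(permittedProbes);
    }

    /** Returns true for the first permittedProbes callers and false for everyone else. */
    boolean tryProbe() {
        // Decrement only while permits remain, so concurrent callers cannot overshoot.
        return remaining.getAndUpdate(n -> n > 0 ? n - 1 : 0) > 0;
    }
}
```

Callers that get `false` should be treated like callers of an open breaker: fail fast or use the fallback.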

12. Choosing Wait Duration

If wait duration is too short, the breaker hammers a dependency that is still down.

If too long, recovery is delayed.

Starting points:

Dependency type	Starting wait duration
fast internal service	5–30 seconds
overloaded service	10–60 seconds
external provider	30 seconds–minutes
database-backed critical service	depends on failover/recovery time
batch/background dependency	longer is acceptable

Use real incident data:

deploy restart time,
leader failover time,
dependency autoscaling time,
cache warmup time,
database recovery time.

Circuit breaker timing should reflect how dependencies actually recover.

13. Circuit Breaker Granularity

Do not create one global breaker for everything.

Bad:

case-service circuit breaker

If searchCases fails, getCaseById also gets blocked.

Better:

case-service.getCaseById
case-service.searchCases
case-service.createEscalation

Granularity choices:

Granularity	Pros	Cons
per dependency	simple	unrelated operations affect each other
per operation	good default	more config/metrics
per dependency + operation + tenant	precise	high cardinality risk
per endpoint path	maps to HTTP	path templates needed
per caller/provider pair	useful platform view	config complexity

Default:

one circuit breaker per dependency operation

Avoid dynamic breaker names using IDs, tenants, users, or raw URLs.

That creates cardinality explosion.
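
One way to enforce low-cardinality names is to build them through a helper that rejects anything that looks dynamic. This is a sketch with a hypothetical rule (letters and hyphens only, so digits, slashes, and query strings are refused); `BreakerNames` is not part of any library.

```java
import java.util.regex.Pattern;

final class BreakerNames {
    // Assumed rule: a segment starts with a letter and contains only letters and hyphens.
    private static final Pattern SEGMENT = Pattern.compile("[a-zA-Z][a-zA-Z-]*");

    private BreakerNames() {}

    /** Builds "dependency.operation" and rejects segments that look like IDs or raw URLs. */
    static String of(String dependency, String operation) {
        requireStatic(dependency);
        requireStatic(operation);
        return dependency + "." + operation;
    }

    private static void requireStatic(String segment) {
        if (!SEGMENT.matcher(segment).matches()) {
            throw new IllegalArgumentException("not a static name segment: " + segment);
        }
    }
}
```

`BreakerNames.of("case-service", "getCaseById")` yields `case-service.getCaseById`, while a segment such as `cases/123` is rejected before it can become a metric label.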

14. Circuit Breaker and Fallback

When breaker opens, what should happen?

Options:

Strategy	Use when
fail fast	command cannot proceed safely
return stale cache	read can tolerate staleness
omit optional enrichment	dependency is non-critical
enqueue async work	command can be deferred
return degraded response	user can proceed with partial data
use alternate provider	safe alternate exists

Fallback must be semantically valid.

Bad fallback:

payment provider unavailable -> pretend payment succeeded

Good fallback:

recommendation service unavailable -> return default ranking

For regulatory/case-management systems, be especially careful.

Failing closed is often safer than pretending success.

15. Resilience4j Basic Usage

CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .slidingWindowType(CircuitBreakerConfig.SlidingWindowType.COUNT_BASED)
    .slidingWindowSize(100)
    .minimumNumberOfCalls(50)
    .failureRateThreshold(50.0f)
    .slowCallDurationThreshold(Duration.ofMillis(500))
    .slowCallRateThreshold(50.0f)
    .waitDurationInOpenState(Duration.ofSeconds(20))
    .permittedNumberOfCallsInHalfOpenState(5)
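    // Classified failures count toward the failure rate.
    .recordException(throwable -> failureClassifier.shouldRecordFailure(throwable))
    // Everything else is ignored: counted as neither success nor failure.
    .ignoreException(throwable -> !failureClassifier.shouldRecordFailure(throwable))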
    .build();

CircuitBreaker breaker = CircuitBreaker.of("case-service.createEscalation", config);

Supplier<EscalationId> decorated =
    CircuitBreaker.decorateSupplier(breaker, () -> callCaseService(command));

EscalationId result = decorated.get();

This is the mechanical part.

The real work is choosing names, thresholds, classification, composition, fallback, and alerts.

16. Spring Boot Configuration Style

Conceptual configuration:

resilience4j:
  circuitbreaker:
    instances:
      caseServiceCreateEscalation:
        slidingWindowType: COUNT_BASED
        slidingWindowSize: 100
        minimumNumberOfCalls: 50
        failureRateThreshold: 50
        slowCallDurationThreshold: 500ms
        slowCallRateThreshold: 50
        waitDurationInOpenState: 20s
        permittedNumberOfCallsInHalfOpenState: 5
        automaticTransitionFromOpenToHalfOpenEnabled: true

Configuration should be owned like production policy.

Do not bury thresholds inside annotations without review.

17. Annotation Convenience and Its Trap

Spring annotation style can be convenient:

@CircuitBreaker(name = "caseServiceCreateEscalation", fallbackMethod = "fallback")
public EscalationId createEscalation(CreateEscalationCommand command) {
    return callRemote(command);
}

But annotation use can hide:

decorator ordering,
retry interaction,
exception mapping,
idempotency policy,
fallback semantics,
operation naming,
metrics labels.

For critical service-to-service communication, explicit client adapter composition is often clearer.

Annotation is acceptable when policy is simple and centrally configured.

18. Decorator Ordering

Composition matters.

Example:

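Supplier<Response> supplier = () -> remoteCall();

// Each with* call wraps everything declared before it, so the last one is outermost.
// Resulting nesting: CircuitBreaker(Retry(Bulkhead(remoteCall))).
// TimeLimiter is left out: it decorates asynchronous calls (CompletionStage/Future),
// so a synchronous supplier relies on the HTTP client's own timeouts instead.
Supplier<Response> decorated =
    Decorators.ofSupplier(supplier)
        .withBulkhead(bulkhead)
        .withRetry(retry)
        .withCircuitBreaker(circuitBreaker)
        .decorate();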

But the meaning depends on ordering.

Questions:

Should breaker see each retry attempt or the final logical call?
Should bulkhead count retries separately?
Should timeout apply per attempt or whole logical call?
Should fallback happen after breaker open or after retry exhaustion?
Should rate limiter limit original calls or attempts?

There is no universal order.

You must decide and test.

Common practical approach for a synchronous dependency operation, listed from outermost to innermost:

rate limit
→ bulkhead
→ circuit breaker
→ retry with deadline awareness
→ timeout per attempt
→ remote call

But some teams place retry outside breaker so breaker sees individual attempt failures.

The key is not memorizing one order.

The key is knowing what each order means.

19. Circuit Breaker and Retry Ordering

Option A — Breaker outside retry

CircuitBreaker(Retry(Call))

Breaker sees one result after retries.

Pros:

breaker represents logical operation success/failure,
transient failures hidden by retry do not open breaker quickly,
less sensitive.

Cons:

dependency may receive more attempts before breaker reacts,
sustained failure may be detected later.

Option B — Retry outside breaker

Retry(CircuitBreaker(Call))

Breaker sees each attempt.

Pros:

breaker reacts faster,
protects dependency sooner.

Cons:

breaker may open from transient attempt failures,
retry may immediately hit open breaker,
metrics need careful interpretation.

For user-facing command APIs, I often prefer:

limited retry inside logical operation,
breaker records final outcome plus slow-call metrics

For high-volume low-latency reads, attempt-level breaker can be acceptable.

Test with failure simulations.
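
The difference between Option A and Option B can be made concrete with plain-Java stand-ins. This sketch does not use Resilience4j: `recording` only counts what a breaker at that position would observe, `retrying` is a bare retry loop without backoff, and all names are hypothetical.

```java
import java.util.function.Supplier;

/** Counts what a breaker placed at different nesting levels would observe. */
final class OrderingDemo {
    int observedFailures = 0;
    int observedSuccesses = 0;

    /** Stand-in for a breaker: records the outcome of whatever it wraps. */
    <T> Supplier<T> recording(Supplier<T> inner) {
        return () -> {
            try {
                T result = inner.get();
                observedSuccesses++;
                return result;
            } catch (RuntimeException e) {
                observedFailures++;
                throw e;
            }
        };
    }

    /** Stand-in for retry: up to maxAttempts tries, then rethrows the last failure. */
    static <T> Supplier<T> retrying(Supplier<T> inner, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        return () -> {
            RuntimeException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return inner.get();
                } catch (RuntimeException e) {
                    last = e;
                }
            }
            throw last;
        };
    }

    /** A call that fails the first failuresBeforeSuccess times, then succeeds. */
    static Supplier<String> flaky(int failuresBeforeSuccess) {
        int[] calls = {0};
        return () -> {
            if (calls[0]++ < failuresBeforeSuccess) {
                throw new IllegalStateException("transient failure");
            }
            return "ok";
        };
    }
}
```

With a call that fails twice and then succeeds, `recording(retrying(call, 3))` observes one success and no failures, while `retrying(recording(call), 3)` observes two failures and one success. Same traffic, very different evidence for the breaker.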

20. Circuit Breaker and Bulkhead Ordering

If bulkhead is outside breaker:

Bulkhead(CircuitBreaker(Call))

Then calls rejected by bulkhead do not reach breaker.

Good: breaker reflects dependency health, not local saturation.

If breaker is outside bulkhead:

CircuitBreaker(Bulkhead(Call))

Then bulkhead rejections may be counted as breaker failures depending on config.

This can open breaker due to caller-side capacity saturation, not dependency failure.

Default recommendation:

bulkhead outside dependency breaker

and do not count BulkheadFullException as dependency failure unless deliberately modeling end-to-end operation health.

21. Circuit Breaker and Timeout

Timeout should happen before breaker records outcome.

If a remote call exceeds timeout:

timeout aborts attempt,
breaker records failure or slow call,
retry policy decides next attempt,
fallback/fail-fast handles final result.

But distinguish:

Timeout type	Count in breaker?
remote response timeout	Yes
connect timeout	Yes
TLS timeout	Yes
pool acquisition timeout	Usually no for dependency health
deadline exceeded before call starts	No remote call happened
bulkhead queue timeout	Usually no for dependency health

This classification matters.

22. Circuit Breaker and Commands

For side-effecting commands, circuit breaker open means:

Do not call dependency.

What should the caller do?

Options:

return 503 to upstream,
enqueue command for later processing,
fail workflow step and retry later,
use alternate route,
block only non-critical operation,
degrade UI.

Do not silently drop commands.

Do not pretend success.

For commands, circuit breaker protects the caller from wasting resources, but business correctness still depends on:

idempotency,
deduplication,
outbox,
reconciliation,
durable workflow state.

23. Circuit Breaker and Reads

Reads often have safer fallbacks.

Examples:

cache fallback,
stale read model,
partial response,
default configuration,
previous known value.

But stale fallback must be explicit.

Example response metadata:

{
  "caseId": "CASE-100",
  "status": "OPEN",
  "dataFreshness": {
    "source": "cache",
    "cachedAt": "2026-07-05T10:15:30Z",
    "stale": true
  }
}

Do not hide stale data if consumers need strong freshness.
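
A staleness bound keeps the cache fallback honest. The following is a minimal sketch, assuming hypothetical types `Cached` and `StaleCacheFallback`; the clock is passed in so the check is testable.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

/** A cached value plus the time it was stored. */
final class Cached<T> {
    final T value;
    final Instant cachedAt;

    Cached(T value, Instant cachedAt) {
        this.value = value;
        this.cachedAt = cachedAt;
    }
}

final class StaleCacheFallback {
    private StaleCacheFallback() {}

    /**
     * Returns the cached value only if its age is at most maxStaleness;
     * otherwise returns empty so the caller fails fast instead of serving old data.
     */
    static <T> Optional<T> freshEnough(Cached<T> cached, Instant now, Duration maxStaleness) {
        if (cached == null) {
            return Optional.empty();
        }
        Duration age = Duration.between(cached.cachedAt, now);
        return age.compareTo(maxStaleness) <= 0 ? Optional.of(cached.value) : Optional.empty();
    }
}
```

When the result is present, the response should still carry the freshness metadata so consumers can see that the value came from cache.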

24. Circuit Breaker Metrics

Minimum metrics:

resilience4j.circuitbreaker.state{name}
resilience4j.circuitbreaker.calls{name,kind}
resilience4j.circuitbreaker.failure.rate{name}
resilience4j.circuitbreaker.slow.call.rate{name}
resilience4j.circuitbreaker.buffered.calls{name}
resilience4j.circuitbreaker.not.permitted.calls{name}

Operational dashboard should show:

breaker state over time,
call volume,
failure rate,
slow call rate,
not-permitted calls,
dependency latency,
timeout rate,
retry rate,
fallback rate,
upstream error rate.

A breaker opening is not always bad.

It may be protecting the system correctly.

The dashboard should show whether user impact is contained.

25. Alerts

Good alerts:

Alert	Meaning
breaker open for critical dependency	dependency outage or sustained slowness
not-permitted calls high	traffic being failed fast
breaker flapping	thresholds/wait duration unstable or dependency unstable
slow call rate rising	early degradation
fallback rate rising	degraded mode active
open breaker plus retry surge	retry policy may be misordered
breaker never opens despite timeouts	classifier/config wrong
breaker opens with low traffic	minimum calls/window too low

Avoid alerting on every state transition.

Alert on sustained or high-impact states.
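
As one example of alerting on a sustained condition instead of a single transition, flapping can be detected by counting transitions to open within a time window. A minimal sketch with a hypothetical `FlapDetector`; in practice this logic usually lives in the alerting system, not in application code.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;

/** Flags a breaker as flapping when it opens too often within a time window. */
final class FlapDetector {
    private final Deque<Instant> opens = new ArrayDeque<>();
    private final Duration window;
    private final int maxOpens;

    FlapDetector(Duration window, int maxOpens) {
        this.window = window;
        this.maxOpens = maxOpens;
    }

    /** Records a transition to OPEN and returns true if the flapping alert should fire. */
    boolean recordOpen(Instant at) {
        opens.addLast(at);
        Instant cutoff = at.minus(window);
        // Forget transitions that happened before the window started.
        while (!opens.isEmpty() && opens.peekFirst().isBefore(cutoff)) {
            opens.removeFirst();
        }
        return opens.size() > maxOpens;
    }
}
```

With a 5-minute window and a limit of 2, the third open within the window raises the alert; a single open 20 minutes later does not.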

26. Circuit Breaker Events

Resilience4j exposes events such as:

state transition,
success,
error,
ignored error,
slow call,
call not permitted.

Use events for logs and diagnostics.

Example:

breaker.getEventPublisher()
    .onStateTransition(event -> logger.warn(
        "Circuit breaker state changed name={} transition={}",
        event.getCircuitBreakerName(),
        event.getStateTransition()
    ))
    .onCallNotPermitted(event -> metrics.incrementNotPermitted(event.getCircuitBreakerName()));

Do not log every success/error event in high-volume systems.

Use metrics for volume.

Use logs for state changes and unusual events.

27. Testing Circuit Breaker Behavior

Minimum tests:

Scenario	Expected behavior
enough failures exceed threshold	breaker opens
failures below minimum calls	breaker does not open
ignored exception	not counted as failure
slow calls exceed threshold	breaker opens
open breaker	remote call not invoked
after wait duration	limited half-open probes allowed
half-open success	breaker closes
half-open failure	breaker reopens
fallback on open	correct degraded/fail-fast behavior
metrics emitted	state and not-permitted visible

Example conceptual test:

@Test
void opensAfterFailureRateThreshold() {
    CircuitBreaker breaker = CircuitBreaker.of("test", CircuitBreakerConfig.custom()
        .slidingWindowSize(10)
        .minimumNumberOfCalls(10)
        .failureRateThreshold(50)
        .build());

    Supplier<String> failing = CircuitBreaker.decorateSupplier(
        breaker,
        () -> { throw new RemoteDependencyUnavailableException(); }
    );

    for (int i = 0; i < 10; i++) {
        assertThatThrownBy(failing::get).isInstanceOf(RuntimeException.class);
    }

    assertThat(breaker.getState()).isEqualTo(CircuitBreaker.State.OPEN);
}

Half-open test:

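@Test
void closesAfterSuccessfulHalfOpenProbes() {
    CircuitBreaker breaker = CircuitBreaker.of("test",
        CircuitBreakerConfig.custom().permittedNumberOfCallsInHalfOpenState(3).build());
    // Drive transitions directly instead of sleeping through waitDurationInOpenState.
    breaker.transitionToOpenState();
    breaker.transitionToHalfOpenState();
    for (int i = 0; i < 3; i++) breaker.executeSupplier(() -> "probe ok");
    assertThat(breaker.getState()).isEqualTo(CircuitBreaker.State.CLOSED);
}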

Use deterministic configs in tests.

Do not make unit tests sleep for real production durations.

28. Chaos and Load Testing

Circuit breaker behavior should be verified under realistic failure.

Test cases:

dependency returns 503 for 60 seconds,
dependency latency jumps to 2 seconds,
10% random connection resets,
gateway timeout spike,
dependency partially recovers,
one operation fails while another remains healthy,
half-open probe surge,
retry + breaker interaction,
fallback cache under load,
low-traffic operation threshold behavior.

Questions to answer:

Does breaker open when expected?
Does it prevent connection/thread exhaustion?
Does it flap?
Are fallbacks safe?
Do retries stop when breaker opens?
Do alerts fire correctly?
Does recovery happen without a thundering herd?

29. Production Policy Template

dependencies:
  case-service:
    operations:
      getCase:
        circuitBreaker:
          enabled: true
          slidingWindowType: TIME_BASED
          slidingWindowSizeSeconds: 30
          minimumNumberOfCalls: 100
          failureRateThreshold: 50
          slowCallDurationThresholdMs: 300
          slowCallRateThreshold: 50
          waitDurationInOpenStateSeconds: 15
          permittedCallsInHalfOpenState: 10
          recordFailures:
            - CONNECT_TIMEOUT
            - READ_TIMEOUT
            - HTTP_502
            - HTTP_503
            - HTTP_504
          ignoreFailures:
            - HTTP_400
            - HTTP_401
            - HTTP_403
            - HTTP_404
            - HTTP_409_DOMAIN_CONFLICT
            - HTTP_422
          fallback: stale-cache-if-fresh-enough

      createEscalation:
        circuitBreaker:
          enabled: true
          slidingWindowType: COUNT_BASED
          slidingWindowSize: 100
          minimumNumberOfCalls: 50
          failureRateThreshold: 40
          slowCallDurationThresholdMs: 600
          slowCallRateThreshold: 60
          waitDurationInOpenStateSeconds: 30
          permittedCallsInHalfOpenState: 5
          fallback: fail-fast-503

Policy should be:

visible,
versioned,
reviewed,
tested,
connected to dashboards,
aligned with timeout/retry/bulkhead policy.

30. Common Anti-Patterns

30.1 One breaker for all operations

A slow search endpoint opens the breaker for a fast lookup endpoint.

30.2 Counting caller errors as dependency failures

Bad request traffic opens the dependency breaker.

30.3 Too-low minimum calls

Breaker opens from tiny samples.

30.4 Too-high minimum calls

Breaker reacts too late.

30.5 No slow-call threshold

Dependency becomes very slow but breaker remains closed until hard failures.

30.6 Breaker without timeout

Calls hang too long before breaker gets evidence.

30.7 Breaker without bulkhead

Even while breaker is closed, slow calls can exhaust caller resources.

30.8 Fallback that lies

Returning fake success for a failed command corrupts business state.

30.9 Hidden annotation policy

Critical communication behavior is invisible in code review.

30.10 No alert on open breaker

Breaker protects system, but nobody knows degradation is active.

31. Decision Model

Circuit breaker is useful only when the service can classify outcomes and define safe behavior when calls are blocked.

32. Design Checklist

Before enabling a circuit breaker:

What dependency and operation does it protect?
What is the breaker name?
Is the name low-cardinality?
Which failures count?
Which failures are ignored?
Are slow calls counted?
What is slow-call threshold?
What is failure-rate threshold?
What is sliding window type and size?
What is minimum number of calls?
What is wait duration in open state?
How many half-open probes are allowed?
What fallback or fail-fast behavior applies?
How does it compose with retry?
How does it compose with timeout?
How does it compose with bulkhead?
Are commands idempotent if retry exists?
Are metrics and alerts configured?
Are half-open and recovery tested?
Is config documented in runbook?

33. The Real Lesson

Circuit breaker is not about being clever.

It is about refusing to keep doing something that is already known to be harmful.

A production Java microservice uses circuit breakers to preserve:

caller capacity,
dependency recovery time,
predictable failure,
observable degradation,
user-facing containment.

The breaker is not the resilience strategy.

It is one containment boundary inside a larger strategy:

timeout
+ retry with budget
+ circuit breaker
+ bulkhead
+ fallback/load shedding
+ observability

That is how synchronous communication fails safely instead of failing everywhere.

References

Resilience4j CircuitBreaker documentation: https://resilience4j.readme.io/docs/circuitbreaker
Resilience4j Getting Started: https://resilience4j.readme.io/docs/getting-started
Martin Fowler — Circuit Breaker: https://martinfowler.com/bliki/CircuitBreaker.html
Google SRE Book — Addressing Cascading Failures: https://sre.google/sre-book/addressing-cascading-failures/
Google SRE Book — Production Services Best Practices: https://sre.google/sre-book/service-best-practices/
AWS Builders Library — Timeouts, retries, and backoff with jitter: https://aws.amazon.com/builders-library/timeouts-retries-and-backoff-with-jitter/
