
title: Learn Java Error, Reliability & Observability Engineering - Part 014
description: Circuit breaker, bulkhead, rate limiter, load shedding, failure isolation, and resilience policy design for production Java services.
series: learn-java-error-reliability-observability
seriesTitle: Learn Java Error, Reliability & Observability Engineering
order: 14
partTitle: Circuit Breaker, Bulkhead & Rate Limit
tags:
  - java
  - reliability
  - resilience
  - circuit-breaker
  - bulkhead
  - rate-limit
  - resilience4j
  - observability
date: 2026-06-28

Part 014 — Circuit Breaker, Bulkhead & Rate Limit

Learning Objectives

After this part, you should be able to design a protection layer for a Java service:

  1. Circuit breaker to stop calls to a dependency that is failing or slow.
  2. Bulkhead to keep one dependency from consuming all threads, connections, or other resources.
  3. Rate limiter to control the admission rate so the service does not overload.
  4. Load shedding to deliberately reject part of the traffic when capacity is insufficient.
  5. Telemetry to prove that the protection layer is working.

The previous part covered retry, timeout, and idempotency. This part addresses the following questions:

If a dependency is broken, when do we stop trying?
If one dependency is slow, how do we keep the whole service from stalling?
If traffic exceeds capacity, who should be rejected first?

1. Mental Model: Reliability Controls as Governors

Retry tries to repair transient failures. But retry without a governor can amplify the failure.

Circuit breaker, bulkhead, and rate limiter are governors:

Pattern | Question it answers
Circuit breaker | Should we keep calling this dependency?
Bulkhead | How many resources may this dependency use?
Rate limiter | How many requests may enter per unit of time?
Load shedding | Which requests should be rejected when capacity is insufficient?

The following order is not a universal law, but it is a safe mental model:

  1. Limit admission.
  2. Limit concurrency.
  3. Check dependency health.
  4. Apply a timeout.
  5. Retry only if it is safe.

2. Problem: Cascading Failure

A cascading failure happens when a failure in one part raises the pressure on another part, so the failure spreads.

Example:

Payment Service is slow
Order Service threads wait on Payment
Order thread pool is exhausted
Order cannot answer health checks
Kubernetes restarts Order
Traffic shifts to the remaining pods
The remaining pods overload
All of checkout is down

Resilience patterns aim to break this positive feedback loop.


3. Circuit Breaker

A circuit breaker keeps the system from repeatedly calling a dependency that is very likely to fail.

State

State | Meaning
CLOSED | Calls proceed normally, metrics are collected
OPEN | Calls are rejected quickly without invoking the dependency
HALF_OPEN | A small number of calls are let through to test for recovery
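
The three states can be read as a small state machine. The sketch below is a deliberately simplified, single-threaded illustration in plain Java, not how Resilience4j implements it. The class name `TinyBreaker`, the consecutive-failure trigger, and the injected clock are all assumptions made for the example; a real breaker would use a sliding window and be thread-safe.

```java
import java.util.function.LongSupplier;

/** Hypothetical, simplified breaker: opens after N consecutive failures. */
final class TinyBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;  // consecutive failures before opening
    private final long openMillis;       // how long to stay OPEN before probing
    private final LongSupplier clock;    // injected so tests can control time
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    TinyBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    /** Returns true if a call may proceed; OPEN rejects without touching the dependency. */
    boolean tryAcquire() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN; // wait elapsed: let a probe through
        }
        return state != State.OPEN;
    }

    /** A successful call (normal or probe) closes the breaker. */
    void onSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    /** A failed probe, or too many consecutive failures, opens the breaker. */
    void onFailure() {
        consecutiveFailures++;
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAt = clock.getAsLong();
        }
    }

    State state() {
        return state;
    }
}
```

Injecting the clock keeps the transition from OPEN to HALF_OPEN testable without sleeping.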

A Circuit Breaker Is Not a Timeout

A timeout answers:

How long may a single call wait?

A circuit breaker answers:

Is the next call worth making?

A Circuit Breaker Is Not a Retry

Retry adds attempts. A circuit breaker removes attempts.

The wrong combination can be dangerous:

The circuit breaker is open, but retry keeps trying again and again

The right combination:

If the breaker is open, fail fast.
Retry only failures that passed through the breaker and are safe to repeat.

4. Circuit Breaker Metrics

A breaker usually evaluates calls over a sliding window.

Metric | Meaning
Failure rate | Percentage of calls that failed
Slow call rate | Percentage of calls slower than the threshold
Minimum calls | Minimum number of calls before the rates are computed
Sliding window | Count-based or time-based window used to compute the rates
Open duration | How long the breaker stays open
Permitted half-open calls | Number of probe calls during recovery

Example Policy

dependency: payment-service
minimumCalls: 50
slidingWindow: 60s
failureRateThreshold: 50%
slowCallDurationThreshold: 500ms
slowCallRateThreshold: 60%
openStateDuration: 30s
halfOpenPermittedCalls: 5

Interpretation:

If there are at least 50 calls in the window, and more than 50% of them fail,
or more than 60% of them are slower than 500ms,
the breaker opens for 30s.
After that, 5 probe calls determine recovery.
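
The open decision in this interpretation is a pure function of the window contents, which makes it easy to test in isolation. The following is a minimal sketch under stated assumptions: `WindowSnapshot` and `shouldOpen` are hypothetical names, not Resilience4j types, and the comparison is a strict "more than" to match the wording above. Resilience4j's documentation describes its own check as "equal or greater", so verify the boundary behavior of whichever library you use.

```java
/** Hypothetical snapshot of one sliding window; not a Resilience4j type. */
record WindowSnapshot(int totalCalls, int failedCalls, int slowCalls) {

    /**
     * Returns true when the breaker should open.
     * Thresholds are percentages (for example 50.0 and 60.0).
     */
    boolean shouldOpen(int minimumCalls,
                       double failureRateThreshold,
                       double slowCallRateThreshold) {
        if (totalCalls < minimumCalls) {
            return false; // too few samples: rates are not evaluated yet
        }
        double failureRate = 100.0 * failedCalls / totalCalls;
        double slowCallRate = 100.0 * slowCalls / totalCalls;
        // Either condition alone is enough to open the breaker.
        return failureRate > failureRateThreshold
                || slowCallRate > slowCallRateThreshold;
    }
}
```

Note how the minimum-calls guard prevents a breaker from opening on, say, 2 failures out of 3 calls right after startup.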

5. Circuit Breaker Classification

Not every exception should count as a breaker failure.

Failure | Count as failure? | Reason
Connect timeout | Yes | Dependency unreachable
Read timeout | Yes | Dependency too slow/unknown
500/502/503/504 | Usually yes | Upstream failure
429 | Maybe separate | Throttling, not always a health failure
400 validation | No | Caller error
401/403 | No or security metric | Auth/config issue
404 domain not found | No | Normal business outcome
Domain rejection | No | Not a dependency failure
Circuit open rejection | No as dependency failure | This is a local protection outcome

If business rejections are counted as failures, the breaker can open because of valid traffic that the domain rejected. That is wrong.
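
For HTTP dependencies, the table above can be encoded as a small status classifier. This is a sketch only: `StatusClassifier` and `Verdict` are hypothetical names, and treating every status not listed in the table (including other 5xx codes) as "not a failure" is an assumption you should revisit for your own dependencies.

```java
/** Hypothetical classifier for HTTP status codes, following the table above. */
final class StatusClassifier {
    enum Verdict { DEPENDENCY_FAILURE, THROTTLED, NOT_A_FAILURE }

    static Verdict classify(int status) {
        return switch (status) {
            // Upstream failures: these count toward the breaker's failure rate.
            case 500, 502, 503, 504 -> Verdict.DEPENDENCY_FAILURE;
            // Throttling is tracked separately; it is not always a health failure.
            case 429 -> Verdict.THROTTLED;
            // 2xx, 400, 401/403, 404 and anything else: caller or business outcome.
            default -> Verdict.NOT_A_FAILURE;
        };
    }

    private StatusClassifier() {
    }
}
```

Keeping 429 as its own verdict lets you feed it into a throttling metric or a backoff decision without tripping the breaker.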


6. Resilience4j Circuit Breaker Example

Resilience4j provides decorators for Circuit Breaker, Retry, RateLimiter, Bulkhead, and others.

CircuitBreakerConfig config = CircuitBreakerConfig.custom()
        .slidingWindowType(CircuitBreakerConfig.SlidingWindowType.TIME_BASED)
        .slidingWindowSize(60)
        .minimumNumberOfCalls(50)
        .failureRateThreshold(50.0f)
        .slowCallDurationThreshold(Duration.ofMillis(500))
        .slowCallRateThreshold(60.0f)
        .waitDurationInOpenState(Duration.ofSeconds(30))
        .permittedNumberOfCallsInHalfOpenState(5)
        .recordException(this::shouldRecordAsDependencyFailure)
        .build();

CircuitBreaker breaker = CircuitBreaker.of("payment-service", config);

Supplier<PaymentResponse> protectedCall = CircuitBreaker
        .decorateSupplier(breaker, () -> paymentClient.createPayment(command));

PaymentResponse response = protectedCall.get();

Classifier:

private boolean shouldRecordAsDependencyFailure(Throwable failure) {
    if (failure instanceof DomainRejectionException) {
        return false;
    }
    if (failure instanceof ValidationException) {
        return false;
    }
    if (failure instanceof DependencyTimeoutException) {
        return true;
    }
    if (failure instanceof DependencyUnavailableException) {
        return true;
    }
    return true;
}

Default production stance:

Record infrastructure/dependency failure.
Ignore expected domain/client failures.

7. Fallback

A fallback is not an empty default that merely keeps things running. A fallback is a business decision.

Scenario | Possible fallback
Recommendation service down | Return popular items
Fraud scoring unavailable | Queue for manual review or block high-risk operation
Price service unavailable | Use cached price if freshness acceptable
Notification service down | Persist outbox for later
Payment provider down | Offer retry later
Identity provider unavailable | Fail closed for privileged action

Fallback Risk

A fallback can violate an invariant.

Bad example:

catch (Exception e) {
    return FraudDecision.APPROVED;
}

If the fraud service fails, approving every transaction is a dangerous failure policy.

Better example:

catch (CallNotPermittedException e) {
    return FraudDecision.manualReview("fraud service circuit open");
}

8. Bulkhead

A bulkhead limits the resources that a single dependency or workload class can consume.

The ship analogy: bulkheads keep one flooded compartment from sinking the whole ship.

Without a bulkhead, a slow dependency can consume every thread.

Bulkhead Types

Type | Meaning | Suitable for
Semaphore bulkhead | Limits concurrent calls | Virtual threads, non-blocking-ish calls, simple isolation
Thread-pool bulkhead | Dedicated thread pool + queue | Blocking dependency, legacy client
Connection pool limit | Limits connections | DB/HTTP clients
Queue limit | Limits backlog | Worker/job processing
Tenant bulkhead | Limits per tenant | Multi-tenant platform
Priority bulkhead | Separates critical from background work | Reliability under load

9. Bulkhead and Virtual Threads

Virtual threads reduce the cost of a blocked thread, but they do not remove the need for a bulkhead.

Without a bulkhead, virtual threads let the system issue so many concurrent calls to a dependency that the dependency collapses.

Virtual threads solve thread scarcity.
They do not solve downstream capacity.

Still limit:

  1. Concurrent calls ke dependency.
  2. DB connections.
  3. HTTP connections.
  4. In-flight requests per tenant.
  5. Queue depth.
  6. Memory usage.
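
Limiting concurrent calls does not require a dedicated thread pool; a counting semaphore is enough, and it behaves the same whether the caller is a platform thread or a virtual thread. The sketch below uses only the JDK. `SimpleBulkhead` is a hypothetical name, and throwing `IllegalStateException` on rejection is an assumption for brevity; a real service would likely use a dedicated exception type.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical minimal semaphore bulkhead built on java.util.concurrent. */
final class SimpleBulkhead {
    private final Semaphore permits;
    private final long maxWaitMillis;

    SimpleBulkhead(int maxConcurrentCalls, long maxWaitMillis) {
        this.permits = new Semaphore(maxConcurrentCalls);
        this.maxWaitMillis = maxWaitMillis;
    }

    /** Runs the call if a permit is available within maxWaitMillis, else rejects. */
    <T> T execute(Supplier<T> call) {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while waiting for permit", e);
        }
        if (!acquired) {
            // Reject fast instead of letting callers pile up behind a slow dependency.
            throw new IllegalStateException("bulkhead full");
        }
        try {
            return call.get();
        } finally {
            permits.release(); // always return the permit, even on failure
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

The `finally` block matters: a permit leaked on an exception path silently shrinks the bulkhead until nothing is admitted.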

10. Resilience4j Bulkhead Example

Semaphore bulkhead:

BulkheadConfig config = BulkheadConfig.custom()
        .maxConcurrentCalls(20)
        .maxWaitDuration(Duration.ofMillis(50))
        .build();

Bulkhead bulkhead = Bulkhead.of("payment-service", config);

Supplier<PaymentResponse> protectedCall = Bulkhead.decorateSupplier(
        bulkhead,
        () -> paymentClient.createPayment(command)
);

PaymentResponse response = protectedCall.get();

Interpretation:

At most 20 concurrent calls.
If full, wait at most 50ms.
If still full, reject quickly.

Fast Rejection Beats an Unbounded Queue

A long queue often makes latency worse.

Queue is not capacity.
Queue is delayed pain.

Use a queue only if:

  1. The workload can wait.
  2. There is a deadline.
  3. There is a maximum queue size.
  4. There is observability.
  5. There is a cancellation/drop policy.

11. Bulkhead Sizing

One way to choose an initial size:

maxConcurrentCalls = dependency_capacity_per_instance * safe_fraction

Or apply Little's Law as a rough estimate:

concurrency = throughput * latency

Little's Law is stated in terms of average latency; plugging in p95, as below, gives a more conservative number.

If the target for Payment calls is:

throughput: 100 req/s
p95 latency: 100ms = 0.1s
needed concurrency: 100 * 0.1 = 10

Add a buffer:

bulkhead = 15-20

If latency spikes to 1s:

100 * 1 = 100 concurrent calls

Without a bulkhead, thread and connection usage climbs sharply. With a bulkhead of 20, the excess calls are rejected quickly and the service stays alive.
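
The arithmetic above is small enough to keep next to the configuration as code, so a reviewer can see where a limit came from. This is a sketch only: `BulkheadSizing` and the headroom factor are assumptions made for illustration, not a sizing standard.

```java
/** Hypothetical helper that reproduces the Little's Law estimate above. */
final class BulkheadSizing {

    /** Little's Law: concurrency = throughput (req/s) * latency (s). */
    static double requiredConcurrency(double throughputPerSecond, double latencySeconds) {
        return throughputPerSecond * latencySeconds;
    }

    /** Applies a headroom factor (for example 1.5 to 2.0) and rounds up. */
    static int suggestedLimit(double throughputPerSecond,
                              double latencySeconds,
                              double headroomFactor) {
        double needed = requiredConcurrency(throughputPerSecond, latencySeconds);
        return (int) Math.ceil(needed * headroomFactor);
    }

    private BulkheadSizing() {
    }
}
```

With 100 req/s and 0.1s latency, a headroom factor between 1.5 and 2.0 reproduces the 15-20 range from the text; at 1s latency the same formula asks for 100 concurrent calls, which is exactly the growth the bulkhead is there to refuse.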


12. Rate Limiter

A rate limiter controls the number of requests per unit of time.

Pattern | Mental model
Fixed window | N requests per window
Sliding window | The window moves more smoothly
Token bucket | Tokens refill periodically; bursts are bounded
Leaky bucket | Stable output rate
Concurrency limiter | Limits in-flight requests, not requests/sec
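
To make the token bucket row concrete, here is a minimal sketch in plain Java. It is not the algorithm Resilience4j uses internally; `TokenBucket`, the injected millisecond clock, and the milli-token bookkeeping (integers instead of floating point, to keep refills exact) are all assumptions made for the example.

```java
import java.util.function.LongSupplier;

/** Hypothetical token bucket: continuous refill, bursts bounded by capacity. */
final class TokenBucket {
    private static final long MILLI_TOKENS_PER_TOKEN = 1000;

    private final long capacityMilliTokens;
    private final long refillPerSecond;   // tokens added per second
    private final LongSupplier clockMillis;
    private long milliTokens;             // current balance, in 1/1000 of a token
    private long lastRefillMillis;

    TokenBucket(int capacity, int refillPerSecond, LongSupplier clockMillis) {
        this.capacityMilliTokens = capacity * MILLI_TOKENS_PER_TOKEN;
        this.refillPerSecond = refillPerSecond;
        this.clockMillis = clockMillis;
        this.milliTokens = capacityMilliTokens; // start full: allows an initial burst
        this.lastRefillMillis = clockMillis.getAsLong();
    }

    /** Takes one token if available; never blocks. */
    synchronized boolean tryAcquire() {
        long now = clockMillis.getAsLong();
        // elapsed ms * tokens/s equals the number of milli-tokens earned.
        long earned = (now - lastRefillMillis) * refillPerSecond;
        milliTokens = Math.min(capacityMilliTokens, milliTokens + earned);
        lastRefillMillis = now;
        if (milliTokens >= MILLI_TOKENS_PER_TOKEN) {
            milliTokens -= MILLI_TOKENS_PER_TOKEN;
            return true;
        }
        return false;
    }
}
```

The capacity bounds the burst size and the refill rate bounds the sustained rate; tuning them separately is the main reason to prefer a token bucket over a fixed window.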

A rate limiter can be applied at:

  1. API gateway.
  2. Service instance.
  3. Client SDK.
  4. Worker consumer.
  5. Per tenant.
  6. Per user.
  7. Per dependency.

Local vs Distributed Rate Limit

Type | Advantage | Risk
Local per instance | Fast, simple | Total limit changes when scaling out
Distributed | Global fairness | New latency/availability dependency
Gateway-level | Centralized | Does not know all business context
Application-level | Context-aware | Must stay consistent across services

13. Rate Limiter Example

RateLimiterConfig config = RateLimiterConfig.custom()
        .limitForPeriod(100)
        .limitRefreshPeriod(Duration.ofSeconds(1))
        .timeoutDuration(Duration.ofMillis(20))
        .build();

RateLimiter limiter = RateLimiter.of("case-create", config);

Supplier<CreateCaseResponse> limited = RateLimiter.decorateSupplier(
        limiter,
        () -> caseService.create(command)
);

CreateCaseResponse response = limited.get();

Interpretation:

100 permissions per second.
If none is available, wait at most 20ms.
If still none is available, reject.

Response mapping:

Context | Response
Public API per-user limit | 429 Too Many Requests
Internal dependency protection | 503 Service Unavailable or a domain-specific failure
Background job | Requeue with delay
Batch import | Mark row as deferred
Admin operation | Explain capacity constraint

14. Load Shedding

Rate limiting is often static. Load shedding is more dynamic: the system rejects traffic when its actual capacity is insufficient.

Load shedding signals:

  1. CPU too high.
  2. Heap pressure.
  3. High GC pauses.
  4. High queue depth.
  5. Thread pool saturated.
  6. DB pool exhausted.
  7. High downstream latency.
  8. High error budget burn.
  9. Kubernetes pod terminating.
  10. Request deadline too close.

Priority Classes

Priority | Example
P0 | Health/safety/regulatory critical write
P1 | User-facing core flow
P2 | User-facing non-critical enrichment
P3 | Background sync
P4 | Analytics/recommendation

Top 1% engineers do not reject traffic at random. They design explicit fairness and priority policies.
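
One simple way to express a priority policy is a function from a load signal to the lowest priority class still admitted. The sketch below is illustrative only: `LoadShedder`, the single `utilization` signal, and every threshold value are assumptions; a real policy would combine several of the signals listed above and be tuned under load tests.

```java
/** Hypothetical priority-aware shedder: higher load admits fewer classes. */
final class LoadShedder {
    enum Priority { P0, P1, P2, P3, P4 }

    /**
     * utilization is a 0.0-1.0 load signal, for example in-flight / capacity.
     * Returns true if a request of the given priority should be admitted.
     */
    static boolean admit(Priority priority, double utilization) {
        Priority lowestAdmitted;
        if (utilization < 0.70) {
            lowestAdmitted = Priority.P4; // healthy: admit everything
        } else if (utilization < 0.80) {
            lowestAdmitted = Priority.P3; // shed analytics/recommendation first
        } else if (utilization < 0.90) {
            lowestAdmitted = Priority.P2; // then background sync
        } else if (utilization < 0.97) {
            lowestAdmitted = Priority.P1; // then non-critical enrichment
        } else {
            lowestAdmitted = Priority.P0; // last resort: critical writes only
        }
        // Enum order encodes importance: P0 has the smallest ordinal.
        return priority.ordinal() <= lowestAdmitted.ordinal();
    }

    private LoadShedder() {
    }
}
```

Because the decision is deterministic, the same request class is always shed first, which is what makes the behavior explainable during an incident.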


15. Composition: Retry, Timeout, Circuit Breaker, Bulkhead, Rate Limiter

Patterns can interfere with each other if they are ordered incorrectly.

RateLimiter -> Bulkhead -> CircuitBreaker -> TimeLimiter/Timeout -> Retry-aware call

The details, however, depend on the library and its semantics.

Design questions:

Question | Implication
Should retry acquire a bulkhead permit per attempt? | Usually yes, so that retry does not bypass concurrency control
Does the circuit breaker see each attempt or the whole operation? | For dependency health, the breaker usually sees each attempt/call to the dependency
Does the rate limiter count retries? | Yes, a retry is still traffic
Is the timeout per attempt or total? | You need both
Is the fallback invoked when the breaker is open? | It can be, if the fallback is safe
Are bulkhead rejections retried? | Usually not immediately; they signal local saturation

Dangerous Composition

Retry outside everything with maxAttempts=5
Each retry bypasses rate limiter
No total deadline

Result:

Local overload + retry storm.

Safer Composition

Total deadline wraps operation
Each attempt:
  acquire rate permission
  acquire bulkhead permit
  check circuit breaker
  call with timeout
Retry only if classifier says safe and budget remains
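
The per-attempt steps above can be sketched as a plain loop, independent of any library. Everything here is an assumption made for illustration: `GuardedCall`, `RejectedException`, modelling the rate limiter and breaker as simple `BooleanSupplier` checks, and treating every other `RuntimeException` as retryable. The attempt itself is assumed to enforce its own per-attempt timeout, and backoff between attempts is omitted.

```java
import java.time.Duration;
import java.util.concurrent.Semaphore;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

/** Hypothetical composition: total deadline outside, guards inside each attempt. */
final class GuardedCall {

    /** Raised by a local guard; these outcomes are never retried here. */
    static final class RejectedException extends RuntimeException {
        RejectedException(String reason) {
            super(reason);
        }
    }

    static <T> T call(Supplier<T> attempt,
                      BooleanSupplier rateLimiterPermits,
                      Semaphore bulkhead,
                      BooleanSupplier breakerPermits,
                      int maxAttempts,
                      Duration totalDeadline) {
        long deadlineNanos = System.nanoTime() + totalDeadline.toNanos();
        RuntimeException last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            if (System.nanoTime() - deadlineNanos >= 0) {
                break; // total budget spent: stop retrying
            }
            if (!rateLimiterPermits.getAsBoolean()) {
                throw new RejectedException("rate limited");
            }
            if (!bulkhead.tryAcquire()) {
                throw new RejectedException("bulkhead full");
            }
            try {
                if (!breakerPermits.getAsBoolean()) {
                    throw new RejectedException("circuit open");
                }
                return attempt.get();
            } catch (RejectedException e) {
                throw e;  // local protection outcome: fail fast
            } catch (RuntimeException e) {
                last = e; // assumed retryable in this sketch
            } finally {
                bulkhead.release(); // permit is held per attempt, not across retries
            }
        }
        throw last != null ? last : new RejectedException("deadline exceeded");
    }

    private GuardedCall() {
    }
}
```

Because every attempt re-enters the rate limiter and bulkhead, a retry can never bypass admission or concurrency control, and an open circuit ends the operation without reaching the dependency.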

16. Resilience Decorator Example

Supplier<PaymentResponse> supplier = () -> paymentClient.createPayment(command);

Supplier<PaymentResponse> decorated =
        Decorators.ofSupplier(supplier)
                .withBulkhead(paymentBulkhead)
                .withCircuitBreaker(paymentCircuitBreaker)
                .withRetry(paymentRetry)
                .decorate();

PaymentResponse response = decorated.get();

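In production, do not just copy a decorator order. With the Decorators builder, each decorator wraps the ones added before it, so in the example above Retry ends up outermost and every attempt passes through the circuit breaker and bulkhead again. Verify:

  1. Which exceptions retry sees.
  2. Which exceptions the breaker counts.
  3. Whether the bulkhead permit is held across retries or per attempt.
  4. Whether the timeout is applied per attempt.
  5. Whether the metrics match what you expect.

Testing the composition matters more than assumptions.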


17. Policy Registry

In a large system, do not let every team pick its own numbers without review.

public record ResiliencePolicy(
        String dependency,
        Duration totalDeadline,
        int maxAttempts,
        Duration attemptTimeout,
        int maxConcurrentCalls,
        int rateLimitPerSecond,
        float failureRateThreshold,
        Duration slowCallThreshold
) {}

Example registry:

dependencies:
  payment-service:
    totalDeadline: 800ms
    attemptTimeout: 250ms
    maxAttempts: 2
    maxConcurrentCalls: 20
    rateLimitPerSecond: 100
    failureRateThreshold: 50
    slowCallThreshold: 500ms

  notification-service:
    totalDeadline: 300ms
    attemptTimeout: 150ms
    maxAttempts: 1
    maxConcurrentCalls: 10
    rateLimitPerSecond: 50
    failureRateThreshold: 60
    slowCallThreshold: 250ms

A policy registry helps with:

  1. Review architecture.
  2. Audit.
  3. Incident analysis.
  4. Consistency.
  5. Prevent config drift.

18. Observability for the Protection Layer

Metrics

Metric | Type | Meaning
circuit_breaker_state | gauge | closed/open/half-open
circuit_breaker_calls_total | counter | successful, failed, ignored, not_permitted
bulkhead_available_concurrent_calls | gauge | remaining capacity
bulkhead_rejections_total | counter | local saturation
rate_limiter_permissions_total | counter | allowed/denied
dependency_slow_calls_total | counter | calls exceeding threshold
fallback_invocations_total | counter | fallback path used
load_shed_total | counter | intentionally rejected
retry_after_circuit_open_total | counter | should usually be zero

Logs

Circuit breaker open event:

{
  "event": "circuit_breaker_opened",
  "dependency": "payment-service",
  "failureRate": 64.0,
  "slowCallRate": 72.0,
  "minimumCalls": 50,
  "window": "60s",
  "correlationId": "corr-123"
}

Bulkhead rejection:

{
  "event": "bulkhead_rejected",
  "dependency": "payment-service",
  "maxConcurrentCalls": 20,
  "availableConcurrentCalls": 0,
  "operation": "createPayment"
}

Rate limit rejection:

{
  "event": "rate_limited",
  "limitName": "case-create",
  "limitForPeriod": 100,
  "refreshPeriodMs": 1000,
  "actor": "tenant-123"
}

Trace

A trace must show whether the request failed because of:

  1. Dependency real failure.
  2. Circuit breaker open.
  3. Bulkhead full.
  4. Rate limit exceeded.
  5. Fallback applied.
  6. Deadline exceeded.

These are all distinct outcomes, not just a "500".


19. Alerting

Bad alert:

Circuit breaker opened once.

An open breaker can mean the protection is working.

Better alert:

Payment circuit breaker open for > 5 minutes AND checkout success rate below SLO.

Or:

Bulkhead rejection > 5% for core operation for 10 minutes.

Or:

Rate limit denial for P0 traffic > 0.

Alert Classes

Alert | Severity
Breaker open for optional dependency | Low/medium
Breaker open for critical dependency with user impact | High
Bulkhead rejection for background job | Low
Bulkhead rejection for core API | High
Rate limiting abusive tenant | Informational/security
Rate limiting all tenants | Capacity incident
Fallback invoked for safe stale cache | Medium
Fallback fail-open for security decision | Critical

20. Testing Strategy

Unit Test

Test classifier:

@Test
void validationExceptionShouldNotTripBreaker() {
    assertFalse(classifier.shouldRecordAsDependencyFailure(
            new ValidationException("bad input")
    ));
}

@Test
void dependencyTimeoutShouldTripBreaker() {
    assertTrue(classifier.shouldRecordAsDependencyFailure(
            new DependencyTimeoutException("timeout")
    ));
}

Integration Test

Simulate dependency returning 503:

Given dependency returns 503 for 50 calls
When client calls operation
Then circuit breaker opens
And subsequent calls fail fast
And dependency receives no call while open

Load Test

Validate:

  1. Bulkhead caps concurrency.
  2. Queue does not grow unbounded.
  3. Rate limiter rejects beyond configured rate.
  4. Breaker opens under dependency failure.
  5. Retry does not bypass rate limit.
  6. Fallback does not violate domain invariant.

Chaos Test

Inject:

  1. 2s latency spike.
  2. 50% 500 response.
  3. Connection refused.
  4. Partial timeout.
  5. Slow success.
  6. Dependency recovery after 30s.

Observe if system recovers without restart.


21. Regulatory/Case Management Angle

For an enforcement lifecycle or a complex case management platform, resilience policy is not only about uptime. It affects defensibility.

Example:

Operation | Failure policy
Create enforcement case | Prefer durable intent + idempotency; do not silently drop
Assign investigator | Retry optimistic conflict if command still valid
Notify regulated party | Outbox + retry + audit trail
Check sanctions/risk list | Fail closed or manual review, depending on policy
Generate audit report | Defer if dependency unavailable
Save decision | Must not fall back to default approval
Publish enforcement action | Idempotent publication key and audit evidence

The protection layer must record evidence:

why rejected?
why deferred?
why manual review?
which dependency failed?
which retry attempts happened?
was decision automatic or fallback?

22. Common Anti-Patterns

22.1 Circuit Breaker as a Replacement for Timeout

A breaker will not save the first call that hangs without a timeout.

22.2 Breaker Counts Business Rejections

A domain rejection is not evidence that the dependency is broken.

22.3 Bulkhead Too Large

A bulkhead as large as the total thread pool isolates nothing.

22.4 Unbounded Queue

An unbounded queue produces unbounded latency and memory pressure.

22.5 Fallback Violates the Domain

A fallback that makes a business decision without evidence can be worse than a failure.

22.6 Rate Limit Without Fairness

A large tenant can consume the capacity and cause small tenants to fail as well.

22.7 Alert on Every Breaker Open

This creates alert fatigue. Alerts should be based on user impact and duration.

22.8 Retry While the Circuit Is Open

If the breaker is open, the dependency is being protected. Local retries only add noise.


23. Design Checklist

[ ] Does every dependency have a circuit breaker?
[ ] Does the failure classifier distinguish technical failures from domain rejections?
[ ] Are slow calls counted?
[ ] Are thresholds based on latency/SLO rather than arbitrary numbers?
[ ] Are half-open probes limited?
[ ] Is the fallback safe for the domain?
[ ] Does every dependency have a bulkhead/concurrency limit?
[ ] Are queues bounded?
[ ] Does the rate limiter have a scope: user, tenant, operation, dependency?
[ ] Are retries counted against the rate limit?
[ ] Are bulkhead rejections protected from blind retries?
[ ] Does load shedding take priority into account?
[ ] Are breaker/bulkhead/rate-limit metrics available?
[ ] Does the dashboard distinguish fail-fast from dependency failure?
[ ] Are alerts based on impact?
[ ] Is the policy documented and reviewed?

24. Practice Exercises

Exercise 1: Dependency Protection Map

Pick one service. Build a table:

dependency | criticality | timeout | retry | breaker | bulkhead | rate limit | fallback

Cover at least 10 dependencies/operations.

Exercise 2: Circuit Breaker Drill

Build a fake dependency with these modes:

  1. 100% success.
  2. 70% failure.
  3. 100% slow.
  4. Recovery after 30 seconds.

Make sure the breaker is:

  1. Closed while healthy.
  2. Open once the threshold is crossed.
  3. Half-open after the wait duration.
  4. Closed again after the probes succeed.

Exercise 3: Bulkhead Saturation

Simulate 100 concurrent calls to a slow dependency with a bulkhead of 10.

Make sure that:

  1. Only 10 concurrent calls get in.
  2. The rest are rejected quickly or wait according to the max wait.
  3. Other service endpoints stay responsive.

Exercise 4: Rate Limit Fairness

Implement a per-tenant rate limit:

tenant A: 100 req/s
tenant B: 10 req/s
tenant C: 10 req/s

Simulate a flood from tenant A. Make sure B and C still get their capacity.
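
As a starting point for this exercise, the sketch below keeps a separate fixed-window budget per tenant so that one tenant's flood cannot consume another tenant's quota. It is an illustration under assumptions, not a reference solution: `TenantRateLimiter` is a hypothetical name, the window reset is left to an external scheduler, and unknown tenants get no budget.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-tenant fixed-window limiter; each tenant has its own budget. */
final class TenantRateLimiter {
    private final Map<String, Integer> limitPerWindow;
    private final Map<String, Integer> usedInWindow = new ConcurrentHashMap<>();

    TenantRateLimiter(Map<String, Integer> limitPerWindow) {
        this.limitPerWindow = Map.copyOf(limitPerWindow);
    }

    /** Returns true if the tenant still has budget in the current window. */
    boolean tryAcquire(String tenant) {
        int limit = limitPerWindow.getOrDefault(tenant, 0); // unknown tenant: no budget
        // merge is atomic per key on ConcurrentHashMap, so counts are not lost.
        int used = usedInWindow.merge(tenant, 1, Integer::sum);
        return used <= limit;
    }

    /** Call once per window (for example every second) from a scheduler. */
    void resetWindow() {
        usedInWindow.clear();
    }
}
```

A natural next step for the exercise is to replace the fixed window with a token bucket per tenant and compare how each behaves at the window boundary.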

Exercise 5: Observability Review

Build a dashboard that shows:

breaker state
not permitted calls
bulkhead rejection
rate limit denial
fallback invocation
dependency latency
user-visible failure

Task: from the dashboard alone, explain whether the system is broken, protected, or misconfigured.


25. Top 1% Mental Model

An average engineer says:

"Add a circuit breaker."

A strong engineer asks:

"Which failures does the breaker count?"
"Are slow calls more dangerous than errors?"
"Is the fallback safe for the domain?"
"Does the bulkhead actually isolate resources?"
"Does retry bypass the rate limiter?"
"Is an open breaker a problem, or is it protection?"
"How is user impact distinguished from a protection event?"

Circuit breaker, bulkhead, and rate limiter are not library decorators. They are explicit policies for how the system preserves stability when dependencies, traffic, and capacity are no longer ideal.


References

  • Resilience4j Documentation — CircuitBreaker, Bulkhead, RateLimiter, Retry, TimeLimiter
  • Google SRE Book — Addressing Cascading Failures
  • Google SRE Book — Production Services Best Practices
  • AWS Well-Architected Reliability Pillar — Control and limit retry calls
  • AWS Builders Library — Timeouts, retries, and backoff with jitter