title: Learn AWS Engineering Mastery - Part 023
description: Observability engineering on AWS using CloudWatch, X-Ray, OpenTelemetry, correlation IDs, metrics, logs, traces, alarms, dashboards, and SLO-driven operations.
series: learn-aws
seriesTitle: Learn AWS Engineering Mastery
order: 23
partTitle: "Observability: CloudWatch, X-Ray, OpenTelemetry, and SLO"
tags:
  - aws
  - cloudwatch
  - xray
  - opentelemetry
  - observability
  - sre
  - slo
  - operations
  - platform-engineering
  - series
date: 2026-07-01

Observability: CloudWatch, X-Ray, OpenTelemetry, and SLO

Learning target: after this part, we can design production-grade observability on AWS: not just building dashboards, but building a telemetry contract that helps engineers detect, explain, and fix system failures quickly.

Observability is often reduced to “we have logs” or “we have a CloudWatch dashboard”. That is not enough for a production-grade system.

Good observability answers the following operational questions:

Are users currently affected?
How large is the impact?
Which service is causing the degradation?
Is the problem latency, errors, saturation, throttling, a dependency, a deployment, data, or a quota?
When did it start?
What changed before the symptoms appeared?
Will a rollback help?
Is the mitigation safe to apply?
What evidence can be preserved for the post-incident review?

AWS provides CloudWatch, CloudWatch Logs, CloudWatch Metrics, CloudWatch Alarms, CloudWatch Dashboards, CloudWatch Logs Insights, CloudWatch Embedded Metric Format, CloudWatch Agent, X-Ray, AWS Distro for OpenTelemetry, CloudTrail, EventBridge, AWS Health, Config, service-specific metrics, and integrations with many managed services.

But the tools are not the point. The core of observability is an information structure that can be used when the system fails.

1. Kaufman Skill Map

A Kaufman-style deconstruction of observability:

Sub-skill	What to master	Self-correction measure
Telemetry design	Choosing the right metric/log/trace/event	Alarms respond to user symptoms, not infrastructure noise
CloudWatch metrics	Namespace, dimension, statistic, period, alarm	Alarms can distinguish error rate, latency, and saturation
Logging	Structured logs, retention, privacy, query	Engineers can answer “what happened?” without SSH
Tracing	Trace ID, span, downstream call, sampling	Engineers can see the request path across services
SLO	SLI, objective, error budget, burn rate	Alerts are based on impact, not CPU alone
Dashboard	Layered dashboard: executive, service, dependency, infrastructure	Dashboards speed up diagnosis rather than decorate the console
Governance	Retention, encryption, access, cost	Telemetry does not leak sensitive data or blow up cost

2. Mental Model: Observability Is a Runtime Contract

Observability is not an add-on feature. It is the runtime contract between a service and its operators.

A production service must emit four main signals:

Signal	Primary use	Do not use for
Metrics	Alarm, trend, capacity, SLO	Per-request forensic detail
Logs	Debug, forensic, audit, event detail	Ad hoc high-cardinality time series
Traces	Request path, dependency latency, causal chain	Long-term audit or every business event
Events	State changes, deployment, scaling, incident timeline	Detailed application debugging without context

A simple rule:

Metrics answer: “is there a problem?”
Logs answer: “what happened?”
Traces answer: “where was the time spent or where did the error appear?”
Events answer: “what changed?”

Top-tier engineers do not turn every signal into a log. They choose the signal based on the operational question.

3. AWS Observability Stack

CloudWatch is the general observability hub on AWS. It can monitor AWS resources and applications in real time, and it provides metrics, alarms, dashboards, logs, an agent, cross-account monitoring, OpenTelemetry support, and other observability features.

AWS observability does not have to mean that all data ends up in CloudWatch. Many enterprises send logs and traces to a SIEM, OpenSearch, Datadog, New Relic, Splunk, Grafana, or a data lake. An AWS-native baseline still matters because:

Many AWS services publish native metrics to CloudWatch.
CloudWatch alarms are easy to use as automation triggers.
IAM/KMS/CloudTrail integration is relatively mature.
Cross-account observability can serve as a multi-account baseline.
Many AWS incident workflows start from AWS-native alarms and events.

4. Metrics: Alarmable Facts, Not Debug Text

A metric is a time-series number with a timestamp, namespace, metric name, dimensions, statistic, and period.

Examples of good metrics:

Metric	Dimension	Why it is good
`RequestCount`	`Service`, `Environment`, `Route`	Measures traffic
`ErrorRate`	`Service`, `Environment`	Alarmable against user impact
`LatencyP95`	`Service`, `Route`	Measures tail latency
`DependencyTimeoutCount`	`Service`, `Dependency`	Isolates dependency failure
`QueueAgeSeconds`	`QueueName`, `Consumer`	Measures backlog freshness
`ThrottleCount`	`Service`, `Resource`	Flags quota/capacity pressure

Bad metrics:

Metric	Problem
`RequestId` as a dimension	Extreme cardinality, rising cost, useless alarms
`UserId` as a default dimension	Cardinality and privacy risk
`ExceptionMessage` as a dimension	Metric fragmentation and potential PII
CPU as the only alarm	Does not always represent user impact

4.1 Namespace and Dimension Discipline

Recommended custom namespace:

Company/Product/Service

Example:

Acme/EnforcementCase/WorkflowService

Minimal example dimensions:

Environment=prod
Service=case-workflow
Region=ap-southeast-1

Add a dimension only if it will be used for diagnosis or alarms.
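One way to enforce this discipline is to route every custom metric through a small guard that rejects unexpected dimensions. The sketch below is an assumption-laden illustration: the allowlist contents and the function name `build_dimensions` are hypothetical, and only the returned `Name`/`Value` shape mirrors what CloudWatch's PutMetricData expects.

```python
# Hypothetical allowlist: keep only dimensions that alarms or dashboards actually use.
ALLOWED_DIMENSIONS = {"Environment", "Service", "Region", "Route", "Dependency"}


def build_dimensions(**dims: str) -> list[dict[str, str]]:
    """Return dimensions in Name/Value form, rejecting anything off the allowlist."""
    unknown = set(dims) - ALLOWED_DIMENSIONS
    if unknown:
        # RequestId, UserId, ExceptionMessage and similar end up here.
        raise ValueError(f"dimension(s) not allowed: {sorted(unknown)}")
    return [{"Name": name, "Value": value} for name, value in sorted(dims.items())]


dimensions = build_dimensions(
    Environment="prod", Service="case-workflow", Region="ap-southeast-1"
)
```

Failing loudly at emit time keeps a high-cardinality dimension from reaching production unnoticed; a team could equally run the same check in code review or CI.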

4.2 Golden Signals

For online services, use the golden signals:

Signal	AWS implementation
Latency	ALB TargetResponseTime, API Gateway Latency, custom p95/p99
Traffic	RequestCount, Count, Invocations, MessagesReceived
Errors	5XX, function errors, failed transitions, DLQ count
Saturation	CPU, memory, concurrency, connection pool, queue age, throttles

For data pipelines:

Signal	Metric
Freshness	Age of latest processed event
Completeness	Expected vs processed records
Error	Failed records, DLQ entries
Throughput	Records/sec, bytes/sec
Lag	IteratorAge, consumer lag, queue age

For a workflow/case-management platform:

Signal	Metric
State transition success rate	Successful transitions / attempted transitions
Escalation latency	Time from trigger to escalation created
SLA breach risk	Cases nearing deadline
Stuck workflow count	Cases with no state change beyond threshold
Audit persistence failure	Failed audit append count
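As an illustration of the "stuck workflow count" row, the sketch below counts cases whose last state change is older than a threshold. The case IDs, the 24-hour threshold, and the in-memory dictionary are assumptions for the example; a real system would read the timestamps from its case store.

```python
from datetime import datetime, timedelta, timezone


def count_stuck_cases(
    last_changed: dict[str, datetime], now: datetime, threshold: timedelta
) -> int:
    """Count cases with no state change for longer than the threshold."""
    return sum(1 for changed_at in last_changed.values() if now - changed_at > threshold)


now = datetime(2026, 7, 1, 12, 0, tzinfo=timezone.utc)
cases = {
    "case-1": now - timedelta(hours=30),  # stuck
    "case-2": now - timedelta(hours=2),   # recently changed
    "case-3": now - timedelta(hours=49),  # stuck
}
stuck = count_stuck_cases(cases, now, timedelta(hours=24))
```

The resulting number is what would be published as a business metric such as a stuck-case gauge.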

5. Logs: Structured Evidence for Debugging and Audit

CloudWatch Logs centralizes logs from systems, applications, and AWS services in a scalable service. In a production system, logs are not print statements. Logs are structured evidence.

Bad log:

Something failed

Good log:

{
  "timestamp": "2026-07-01T10:15:12.341Z",
  "level": "ERROR",
  "service": "case-workflow",
  "environment": "prod",
  "correlationId": "c-8e9b1",
  "caseIdHash": "h:2a9f...",
  "tenantId": "regulator-a",
  "operation": "transitionCaseState",
  "fromState": "UNDER_REVIEW",
  "toState": "ESCALATED",
  "errorType": "ConditionalWriteFailed",
  "dependency": "dynamodb",
  "durationMs": 184,
  "retryable": true
}
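A log line like the one above can be produced with Python's standard `logging` module and a custom formatter. This is a minimal sketch, not a recommended library: the `JsonFormatter` class and the convention of passing structured values under `extra={"fields": ...}` are assumptions made for the example.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def __init__(self, service: str, environment: str) -> None:
        super().__init__()
        self.service = service
        self.environment = environment

    def format(self, record: logging.LogRecord) -> str:
        timestamp = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            "timestamp": timestamp.isoformat(timespec="milliseconds"),
            "level": record.levelname,
            "service": self.service,
            "environment": self.environment,
            "message": record.getMessage(),
        }
        # Values passed as extra={"fields": {...}} are merged into the entry.
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry)


logger = logging.getLogger("case-workflow")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter("case-workflow", "prod"))
logger.addHandler(handler)
logger.error(
    "transition failed",
    extra={"fields": {"correlationId": "c-8e9b1", "errorType": "ConditionalWriteFailed", "retryable": True}},
)
```

Note that `isoformat` renders UTC as `+00:00` rather than `Z`; either form is fine as long as every service uses the same one.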

5.1 Logging Levels

Level	Meaning	Example
DEBUG	Development/temporary detailed data	Disabled by default in prod
INFO	Business/operational milestone	Case transitioned, job completed
WARN	Recoverable anomaly	Retry, fallback, degraded dependency
ERROR	Failed operation requiring attention	Request failed, audit append failed
FATAL/CRITICAL	Service cannot continue or severe data risk	Cannot load config, integrity breach

5.2 Log Events That Matter

Production logs should capture:

Request received and completed.
External dependency call and result.
State transition attempt and result.
Authorization decision failure.
Validation failure category, not raw sensitive payload.
Retry and final failure.
DLQ publication.
Idempotency conflict.
Data integrity violation.
Deployment version and configuration version.

5.3 Log Retention

Do not keep all logs forever by default.

Log type	Typical retention reasoning
Debug application logs	Short retention, e.g. 7-30 days
Production error logs	Medium retention, e.g. 30-90 days
Security/audit logs	Long retention based on compliance
Access logs	Depends on forensic and privacy policy
Regulated evidence logs	Explicit retention/legal policy

Important distinction:

Operational logs help run the service.
Audit records prove what happened.
Evidence records may need immutability and chain-of-custody controls.

Do not confuse ordinary logs with regulatory evidence.

6. Embedded Metric Format

CloudWatch Embedded Metric Format allows applications to emit structured log events that CloudWatch can extract into metrics. This is useful when we want logs and metrics from one event emission path.

Example:

{
  "_aws": {
    "Timestamp": 1782900912341,
    "CloudWatchMetrics": [
      {
        "Namespace": "Acme/CaseWorkflow",
        "Dimensions": [["Environment", "Service"]],
        "Metrics": [
          { "Name": "TransitionLatencyMs", "Unit": "Milliseconds" },
          { "Name": "TransitionFailure", "Unit": "Count" }
        ]
      }
    ]
  },
  "Environment": "prod",
  "Service": "case-workflow",
  "TransitionLatencyMs": 92,
  "TransitionFailure": 0,
  "correlationId": "c-8e9b1"
}

Use EMF when:

Application code owns the metric.
You want structured log + metric together.
Metric dimensions are controlled.
You avoid high-cardinality dimensions.

Avoid EMF when:

Every user/request becomes a metric dimension.
You need pure high-throughput metric ingestion without log retention cost.
You cannot control payload shape.
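The EMF document shown earlier can be assembled by a small helper so that dimensions and metric definitions stay in one place. This is a sketch under stated assumptions: `emf_event` and its parameters are hypothetical names, and the event still has to be written to a log stream that CloudWatch ingests (for example, standard output in Lambda).

```python
import json
import time


def emf_event(namespace, dimensions, metrics, properties=None, timestamp_ms=None):
    """Build one EMF log line.

    dimensions: dict of dimension name -> value (keep these low-cardinality)
    metrics:    dict of metric name -> (value, unit)
    properties: extra searchable fields that do not become dimensions
    """
    event = {
        "_aws": {
            "Timestamp": timestamp_ms if timestamp_ms is not None else int(time.time() * 1000),
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [list(dimensions)],
                    "Metrics": [{"Name": name, "Unit": unit} for name, (_, unit) in metrics.items()],
                }
            ],
        }
    }
    event.update(dimensions)
    event.update({name: value for name, (value, _) in metrics.items()})
    event.update(properties or {})
    return json.dumps(event)


line = emf_event(
    "Acme/CaseWorkflow",
    {"Environment": "prod", "Service": "case-workflow"},
    {"TransitionLatencyMs": (92, "Milliseconds"), "TransitionFailure": (0, "Count")},
    properties={"correlationId": "c-8e9b1"},
    timestamp_ms=1782900912341,
)
```

Keeping `correlationId` in `properties` rather than `dimensions` is the point of the design: it remains queryable in the log without creating a new metric per request.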

7. Tracing: Causal Path Across Services

Metrics tell us there is a problem. Logs tell us details. Traces show the path.

AWS X-Ray collects request data and helps visualize and analyze requests across applications and downstream dependencies. AWS Distro for OpenTelemetry can collect and send metrics/traces to AWS X-Ray, CloudWatch, OpenSearch, and other monitoring systems.

A useful trace should show:

Entry point.
Service boundaries.
Dependency calls.
Latency per hop.
Error per segment/span.
Retry behavior.
Trace/correlation ID.
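With X-Ray, the trace context travels in the `X-Amzn-Trace-Id` HTTP header as semicolon-separated `key=value` pairs. The sketch below parses that header so the root trace ID can be copied into structured logs; the function name is hypothetical and the header value is only an example.

```python
def parse_trace_header(header: str) -> dict[str, str]:
    """Split 'Root=...;Parent=...;Sampled=1' into a dict, ignoring malformed parts."""
    fields = {}
    for part in header.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep:
            fields[key] = value
    return fields


example = "Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1"
trace = parse_trace_header(example)
```

Logging `trace["Root"]` next to the correlation ID lets an engineer jump from a log line to the matching trace.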

7.1 X-Ray vs OpenTelemetry

Option	Strength	Trade-off
X-Ray SDK/native integration	AWS-native, service map, direct integration	More AWS-specific instrumentation model
OpenTelemetry/ADOT	Open standard, portable, vendor-flexible	Collector/configuration complexity
CloudWatch Agent OTLP	Consolidates metrics/traces ingestion path	Requires agent lifecycle management

7.2 Sampling

Tracing every request may be expensive. Sampling controls volume.

Sampling strategy:

Situation	Sampling approach
Low traffic critical service	Higher sampling
High traffic stable service	Lower default sampling
Error responses	Always sample or biased sampling
Canary deployment	Temporarily higher sampling
Incident investigation	Increase sampling with time limit

Failure mode: sampling only successful requests makes traces useless during incidents.
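A minimal sketch of error-biased sampling follows. It assumes the decision is made after the request outcome is known (tail-style), since a decision made at the start of a request cannot know whether it will fail; the function name and the hashing scheme are illustrative choices, not an X-Ray or OpenTelemetry API.

```python
import hashlib


def should_sample(trace_id: str, is_error: bool, base_rate: float) -> bool:
    """Always keep errors; keep a deterministic fraction of everything else."""
    if is_error:
        return True
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Map the first 4 bytes of the hash to a value in [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return bucket < base_rate
```

Hashing the trace ID instead of calling a random generator means every service that sees the same trace reaches the same decision, so sampled traces stay complete across hops.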

8. Correlation ID and Context Propagation

Every request crossing a boundary should carry context.

Minimum context:

correlationId
traceId
service
operation
tenantId or tenantHash
environment
requestId
actorType

Do not log raw sensitive identifiers when hashes or internal references are sufficient.

For async messaging, context must be placed in message attributes or event metadata. Without this, an incident timeline breaks at the queue boundary.

Recommended event envelope:

{
  "eventId": "evt-001",
  "eventType": "CaseEscalated",
  "occurredAt": "2026-07-01T10:15:12Z",
  "correlationId": "c-8e9b1",
  "traceId": "1-...",
  "producer": "case-workflow",
  "tenantId": "regulator-a",
  "schemaVersion": "1.0",
  "payload": {}
}
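For SQS, the context can ride along as message attributes. The sketch below converts a context dictionary to and from the `DataType`/`StringValue` shape that the SQS SendMessage and ReceiveMessage APIs use for `MessageAttributes`; the helper names and the chosen key list are assumptions for this example.

```python
# Context fields propagated across the queue boundary (hypothetical selection).
CONTEXT_KEYS = ("correlationId", "traceId", "tenantId", "eventId", "schemaVersion")


def to_message_attributes(context: dict[str, str]) -> dict[str, dict[str, str]]:
    """Producer side: wrap non-empty context fields as String message attributes."""
    return {
        key: {"DataType": "String", "StringValue": context[key]}
        for key in CONTEXT_KEYS
        if context.get(key)
    }


def from_message_attributes(attributes: dict[str, dict[str, str]]) -> dict[str, str]:
    """Consumer side: recover the context before doing any work or logging."""
    return {
        key: attr["StringValue"]
        for key, attr in attributes.items()
        if key in CONTEXT_KEYS
    }


context = {"correlationId": "c-8e9b1", "tenantId": "regulator-a", "eventId": "evt-001", "schemaVersion": "1.0"}
attributes = to_message_attributes(context)
restored = from_message_attributes(attributes)
```

The consumer should restore the context first and attach it to every log line and metric it emits, so the timeline continues on the far side of the queue.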

9. SLI, SLO, and Error Budget

A dashboard without an SLO often becomes decoration. An SLO converts telemetry into an engineering commitment.

9.1 Definitions

Term	Meaning
SLI	Service Level Indicator; measurement
SLO	Service Level Objective; target
SLA	Legal/business agreement
Error budget	Allowed unreliability within SLO window

Example:

SLI: percentage of successful case state transitions completed under 500 ms
SLO: 99.5% over rolling 30 days
Error budget: 0.5% failed/slow transitions allowed
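To make the example concrete, the sketch below turns the 99.5% objective into a number of allowed bad transitions. The volume of 1,000,000 transitions and the 2,000 observed failures are invented figures for illustration, and `error_budget` is a hypothetical helper.

```python
def error_budget(slo_target: float, total_events: int, bad_events: int) -> dict[str, float]:
    """Express the error budget of an event-based SLO in events."""
    allowed_bad = (1.0 - slo_target) * total_events
    return {
        "allowed_bad_events": allowed_bad,
        "remaining_bad_events": allowed_bad - bad_events,
        "budget_consumed": bad_events / allowed_bad if allowed_bad else float("inf"),
    }


# 99.5% over a window with 1,000,000 transitions, 2,000 of them failed or slow.
budget = error_budget(0.995, 1_000_000, 2_000)
```

Under these assumptions the budget is roughly 5,000 bad transitions, of which about 40% is already spent.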

9.2 Good SLI Examples

Workload	SLI
API service	% valid requests that return non-5xx under latency threshold
Workflow engine	% state transitions committed successfully under threshold
Async worker	% messages processed before freshness deadline
Data pipeline	% partitions delivered complete before SLA deadline
Search service	% queries returning successful result under p95 threshold

9.3 Burn Rate

Burn rate measures how fast error budget is being consumed.

burn_rate = current_error_rate / allowed_error_rate

If the SLO allows a 0.1% error rate and the current error rate is 1%, the burn rate is 10x.

A good alerting policy uses multiple windows and response tiers:

Alert	Meaning
Fast burn	Severe current incident
Slow burn	Sustained degradation
Ticket alert	Needs action but not page
Dashboard-only	Informational trend
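The burn-rate formula and a two-window check can be sketched as follows. The thresholds of 10 and 2 are placeholder values chosen for this example, not recommendations; requiring both a long and a short window to exceed the threshold is one common way to avoid paging for an incident that has already recovered.

```python
def burn_rate(error_rate: float, slo_target: float) -> float:
    """burn_rate = current_error_rate / allowed_error_rate"""
    return error_rate / (1.0 - slo_target)


def classify(
    long_window_error_rate: float,
    short_window_error_rate: float,
    slo_target: float,
    fast_threshold: float = 10.0,  # assumed paging threshold
    slow_threshold: float = 2.0,   # assumed ticket threshold
) -> str:
    """Return 'page', 'ticket', or 'ok' based on burn rate in both windows."""
    long_burn = burn_rate(long_window_error_rate, slo_target)
    short_burn = burn_rate(short_window_error_rate, slo_target)
    if long_burn >= fast_threshold and short_burn >= fast_threshold:
        return "page"
    if long_burn >= slow_threshold and short_burn >= slow_threshold:
        return "ticket"
    return "ok"
```

For instance, `classify(0.02, 0.0005, 0.999)` returns `"ok"`: the long window still shows a burn rate near 20x, but the short window shows the errors have stopped.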

10. Alarm Design

Bad alarm:

CPU > 80% for 5 minutes

This may be useful, but high CPU does not necessarily mean user impact.

Better alarm:

p95 latency > 800 ms AND 5xx rate > 2% for 5 minutes on prod API

Or:

ApproximateAgeOfOldestMessage > freshness target for 10 minutes
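The queue-age alarm can be expressed as the keyword arguments for boto3's `put_metric_alarm`. This sketch only builds the arguments and makes no AWS call; the alarm name pattern, the one-minute period, and the `notBreaching` choice for missing data are assumptions to adapt.

```python
def queue_age_alarm(queue_name: str, freshness_target_seconds: int, sns_topic_arn: str) -> dict:
    """Arguments for cloudwatch.put_metric_alarm: oldest message too old for 10 minutes."""
    return {
        "AlarmName": f"prod-{queue_name}-oldest-message-age",
        "Namespace": "AWS/SQS",
        "MetricName": "ApproximateAgeOfOldestMessage",
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "Statistic": "Maximum",
        "Period": 60,              # seconds per datapoint
        "EvaluationPeriods": 10,   # 10 x 60 s = 10 minutes
        "DatapointsToAlarm": 10,
        "Threshold": float(freshness_target_seconds),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }


# Applying it would look like:
#   boto3.client("cloudwatch").put_metric_alarm(**queue_age_alarm(...))
alarm = queue_age_alarm("case-events", 300, "arn:aws:sns:ap-southeast-1:123456789012:oncall")
```

The queue name, account ID, and topic ARN above are placeholders.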

10.1 Alarm Severity

Severity	Condition	Response
Sev1	Broad user impact, data integrity risk, critical security issue	Page immediately
Sev2	Significant degradation, partial region/workload impact	Page team/on-call
Sev3	Degraded non-critical path or approaching quota	Ticket/working-hours response
Sev4	Trend or hygiene issue	Backlog

10.2 Symptom vs Cause Alarms

Use symptom alarms to page. Use cause alarms to diagnose.

Type	Example	Page?
Symptom	User request success rate below SLO	Yes
Cause	RDS CPU high	Usually no alone
Cause	Lambda throttles	Maybe if linked to user impact
Cause	Queue age high	Yes if freshness SLO violated
Cause	Disk usage high	Ticket unless imminent outage

10.3 Composite Alarms

Composite alarms reduce noise by combining signals:

ALARM if:
  API5xxHigh == ALARM
  AND RequestCountNormal == ALARM
  AND DeploymentInProgress != ALARM

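This avoids paging for low-traffic anomalies and supports deployment-aware alarms. It assumes RequestCountNormal is defined so that it enters ALARM only while traffic is above a minimum threshold.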
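In CloudWatch, the condition above is written as an alarm rule expression using `ALARM(...)`, `AND`, and `NOT`. The sketch below builds that expression from alarm names; `composite_rule` is a hypothetical helper, and the resulting string is what would be passed as `AlarmRule` to `put_composite_alarm`.

```python
def composite_rule(all_in_alarm: list[str], none_in_alarm: list[str]) -> str:
    """Build a rule that fires when every alarm in the first list is in ALARM
    and no alarm in the second list is."""
    terms = [f'ALARM("{name}")' for name in all_in_alarm]
    terms += [f'NOT ALARM("{name}")' for name in none_in_alarm]
    return " AND ".join(terms)


rule = composite_rule(
    ["API5xxHigh", "RequestCountNormal"],
    ["DeploymentInProgress"],
)
```

Generating the rule from lists keeps the child alarm names in one reviewable place instead of scattered across hand-edited strings.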

11. Dashboards That Actually Help

Use layered dashboards.

11.1 Executive/Service Health Dashboard

Shows:

Current SLO compliance.
Error budget remaining.
Active incidents.
User-impacting latency/error.
Business process throughput.

11.2 Service Owner Dashboard

Shows:

Request rate.
Error rate.
Latency p50/p95/p99.
Dependency error/latency.
Queue age.
Worker throughput.
Deployment version.
Throttling/quota.

11.3 Dependency Dashboard

Shows:

RDS/Aurora connections, CPU, replica lag, deadlocks.
DynamoDB throttles, consumed capacity, hot partition symptoms.
SQS age, visible/not visible messages, DLQ.
Lambda concurrency, errors, duration, throttles.
ALB target health, 5xx, response time.

11.4 Incident Dashboard

Shows only what an incident commander needs:

User impact.
Start time.
Deployment/change timeline.
Regional/AZ symptoms.
Dependency health.
Mitigation state.
Recovery trend.

Dashboard smell:

80 graphs and no clear answer.
No SLO.
No deployment marker.
No dependency view.
All p50, no p95/p99.
No tenant/environment dimension.

12. Workload-Specific Observability Patterns

12.1 Lambda

Minimum signals:

Signal	Why
Invocations	Traffic
Errors	Failure rate
Duration p95/p99	Latency and timeout risk
Throttles	Concurrency/capacity issue
ConcurrentExecutions	Saturation
IteratorAge	Stream lag
DLQ/destination failures	Async failure
Cold start count/custom metric	Runtime efficiency

12.2 ECS/EKS

Minimum signals:

Layer	Signals
Service	Request rate, error rate, latency, dependency latency
Container	CPU, memory, restart count, OOMKilled
Cluster	capacity, pending tasks/pods, node pressure
Ingress	ALB 5xx, target response time, target health
Deployment	version, rollout state, failed deployment

12.3 API Gateway / ALB

Minimum signals:

Signal	Meaning
Request count	Traffic
4xx	Client/auth/input issue
5xx	Service/platform issue
Latency	End-to-end gateway latency
Integration latency	Backend latency
Throttles	Rate/quota issue

12.4 SQS/EventBridge/Step Functions

Minimum signals:

Service	Signals
SQS	Age of oldest message, visible messages, not visible messages, DLQ count
EventBridge	Failed invocations, throttles, DLQ
Step Functions	Failed/timed-out executions, execution duration, state transition failures

For a workflow platform, never monitor only request count. Also monitor stuck business state.

Example custom metrics:

CasesStuckInReview
EscalationDeadlineBreaches
AuditAppendFailureCount
ManualOverrideCount
ReopenCaseRate

13. Observability for Regulated Case Management

A regulated case-management/enforcement system needs two telemetry planes: operational telemetry and evidence records.

Operational telemetry can be sampled, aggregated, and expired. Evidence records often cannot.

Do not store regulatory evidence only in application logs. Logs are optimized for operations, not necessarily legal defensibility.

Minimum regulated observability model:

Concern	Implementation direction
Who changed case state	Append-only audit event
Why changed	Reason code/comment reference
When changed	Server-side timestamp
Under what authority	Role/permission/context
Was transition valid	Policy/rule version
Was notification sent	Notification event/result
Was SLA breached	Case timer metric + event
Was evidence accessed	Access audit event

14. Security and Privacy

Observability data often contains sensitive information. Treat telemetry as production data.

Rules:

Do not log passwords, tokens, session cookies, private keys, or raw secrets.
Avoid raw PII unless explicitly required and protected.
Use structured redaction libraries.
Encrypt log groups if policy requires customer-managed KMS keys.
Set retention explicitly.
Restrict CloudWatch Logs Insights access.
Separate security logs from application debug logs.
Store audit/evidence records in tamper-resistant storage when required.
Monitor access to logs through CloudTrail.
Use a separate account as the boundary for centralized logging when appropriate.

Bad pattern:

logger.info("request={}", fullHttpRequest)

Better pattern:

logger.info("request received", fields: method, route, tenant, correlationId, contentLength)
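The better pattern amounts to an allowlist: only named, reviewed fields ever reach the logger. A minimal sketch, assuming the request has already been reduced to a dictionary and that `SAFE_FIELDS` is the team's agreed list:

```python
# Fields that are safe to log (hypothetical list, taken from the example above).
SAFE_FIELDS = ("method", "route", "tenant", "correlationId", "contentLength")


def safe_request_fields(request: dict) -> dict:
    """Keep only allowlisted fields; everything else is dropped, not masked."""
    return {key: request[key] for key in SAFE_FIELDS if key in request}


request = {
    "method": "POST",
    "route": "/cases/{id}/transition",
    "tenant": "regulator-a",
    "correlationId": "c-8e9b1",
    "contentLength": 512,
    "authorization": "Bearer example-token",  # must never reach the log
    "body": {"note": "free text that may contain PII"},
}
fields = safe_request_fields(request)
```

An allowlist fails safe: a newly added header or body field stays out of the logs until someone deliberately adds it, whereas a denylist leaks anything it did not anticipate.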

15. Cost Engineering for Observability

Observability cost is real. Cost issues usually come from:

Excessive log volume.
Long retention for noisy logs.
High-cardinality custom metrics.
Too many dashboards/alarms with low value.
Tracing every request in high-volume service.
Copying logs to multiple vendors without filtering.
Debug logs left on in production.

Cost controls:

Control	Benefit
Retention by log group	Avoid infinite storage
Sampling	Reduce trace cost
EMF dimension discipline	Avoid metric explosion
Subscription filters	Export only useful streams
Log level governance	Reduce noise
Aggregated business metrics	Lower cardinality
Separate hot/cold storage	Lower long-term cost

Operational rule:

Every new telemetry stream should have an owner, retention policy, security classification, and known use case.

16. Failure Modes

Failure mode	Symptom	Prevention
No correlation IDs	Incident timeline breaks across services	Standard request/event envelope
Alarm on cause only	Pages for CPU but misses user impact	SLO/symptom alarms
High-cardinality metrics	Cost spike and unusable metrics	Dimension review
Logs contain secrets	Security incident	Redaction and policy tests
No deployment markers	Hard to connect incident with change	Emit deployment events
Async boundary loses trace	Cannot diagnose delayed failures	Propagate context through messages
Dashboard too broad	Slow diagnosis	Layered dashboards
No runbook linked to alarm	On-call improvises	Alarm-to-runbook mapping
Sampling hides failures	Traces missing for errors	Bias sampling toward errors
Audit mixed with debug logs	Regulatory evidence weak	Separate audit store

17. Alarm-to-Runbook Mapping

Every production alarm should have:

alarmName: prod-case-workflow-high-transition-failure-rate
owner: case-platform-team
severity: Sev2
userImpact: case state transitions may fail
slo: case-transition-success-rate
firstChecks:
  - check recent deployments
  - check DynamoDB conditional failures
  - check downstream event publish failures
  - check IAM/KMS errors
mitigation:
  - pause non-critical workflow consumers
  - rollback latest deployment if error started after release
  - increase provisioned capacity only if throttling confirmed
rollbackSafe: conditional
escalation:
  - platform-oncall
  - data-platform-oncall if persistence failure

If an alarm does not have an owner and first response steps, it is not production-ready.
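This rule can be checked automatically, for example in CI, against each alarm definition. The sketch assumes the YAML above has already been loaded into a dictionary; the required-field list and the function name are hypothetical.

```python
# Fields an alarm definition must carry before it is considered production-ready.
REQUIRED_FIELDS = ("alarmName", "owner", "severity", "firstChecks", "mitigation", "escalation")


def missing_runbook_fields(alarm: dict) -> list[str]:
    """Return required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not alarm.get(field)]


alarm = {
    "alarmName": "prod-case-workflow-high-transition-failure-rate",
    "owner": "case-platform-team",
    "severity": "Sev2",
    "firstChecks": ["check recent deployments"],
    "mitigation": [],  # empty: flagged as missing
}
problems = missing_runbook_fields(alarm)
```

Here `problems` lists `mitigation` and `escalation`, which would block the alarm from being promoted until the runbook is filled in.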

18. Deliberate Practice

Exercise 1: Build a Service Health Dashboard

For one service, define:

Request rate.
Error rate.
p95/p99 latency.
Dependency latency.
Queue age if async.
Current deployment version.
SLO compliance.

Self-check:

Can a new on-call engineer identify impact in under 2 minutes?
Can they see whether the issue is service or dependency?
Can they find the latest deployment/change?

Exercise 2: Design Three Alarms

Create:

One fast-burn SLO alarm.
One slow-burn SLO alarm.
One dependency saturation alarm.

Self-check:

Which alarm pages humans?
Which alarm only opens a ticket?
Which alarm triggers automation?

Exercise 3: Trace an Async Flow

Pick a flow:

API -> service -> SQS -> worker -> database -> notification

Propagate:

correlationId
traceId
tenantId
eventId
schemaVersion

Self-check:

Can you reconstruct the full path from one user complaint?
Can you find where latency accumulated?
Can you replay safely if a message failed?

19. Production Checklist

[ ] Every service has an owner and a service catalog entry.
[ ] Every service emits structured logs.
[ ] Every request has correlation ID.
[ ] Async events carry correlation context.
[ ] Metrics separate service, dependency, and business signals.
[ ] SLO is defined for critical user journeys.
[ ] Page alarms are symptom/SLO-based.
[ ] Cause alarms are used for diagnosis or tickets.
[ ] Dashboards are layered by audience.
[ ] Logs have explicit retention.
[ ] Sensitive fields are redacted.
[ ] Trace sampling strategy is documented.
[ ] Deployment markers are visible.
[ ] Every alarm has a linked runbook.
[ ] Cost/cardinality review exists for custom metrics.
[ ] Audit/evidence records are separate from debug logs.

20. Summary

Observability engineering on AWS is the ability to build systems that can explain themselves when they fail.

Key points of Part 023:

Observability is a runtime contract, not a dashboard.
Metrics are for alarms and trends.
Logs are for forensics and debugging.
Traces are for the causal path.
Events are for the change timeline.
SLOs turn telemetry into an operational commitment.
CloudWatch is a strong AWS-native baseline.
X-Ray and OpenTelemetry help reveal the distributed path.
Correlation IDs are the backbone of cross-service investigation.
Telemetry must be secure, cost-efficient, and owned.

In Part 024, we will discuss how these observability signals are used in real operations: Systems Manager, Session Manager, Automation, OpsCenter, Incident Manager, runbooks, playbooks, patching, and incident response.
