Final StretchOrdered learning track

Learn Sql In Action Part 034 System Design With Sql And Relational Databases

[]22 min read4228 words

In This Lesson

1. Kaufman Lens: Deconstructing Database System Design 2. Database as Consistency Boundary 3. Unit of Consistency vs Unit of Deployment

PrevNext

Lesson 3435 lesson track30–35 Final Stretch

title: Learn SQL in Action - Part 034 description: System design with SQL and relational databases: OLTP vs OLAP, read replicas, CQRS, outbox/inbox, CDC, transactional messaging, cache invalidation, consistency boundaries, and failure modelling. series: learn-sql-in-action seriesTitle: Learn SQL in Action order: 34 partTitle: System Design with SQL and Relational Databases tags:

sql
database
system-design
cdc
outbox
cqrs
replication
consistency
cache
series date: 2026-07-01

Part 034 — System Design with SQL and Relational Databases

Target skill: mampu menempatkan SQL database secara benar di arsitektur sistem: bukan hanya sebagai tempat menyimpan data, tetapi sebagai consistency boundary, integration source, audit anchor, read model source, dan operational dependency yang punya failure mode.

Outcome praktis: ketika mendesain sistem case management, regulatory workflow, order lifecycle, billing, entitlement, atau audit platform, kita tahu kapan memakai single relational database, kapan memakai replica, kapan memakai outbox/CDC, kapan membuat read model, kapan cache berbahaya, dan kapan sharding/partitioning diperlukan.

SQL skill untuk top engineer tidak berhenti di query. Skill itu harus naik ke system design.

Database relasional adalah salah satu komponen paling kuat dalam sistem karena ia bisa memberikan:

durable state,
transaction boundary,
referential integrity,
concurrency control,
auditability,
queryability,
recovery path,
integration stream,
operational visibility.

Tetapi kekuatan itu datang dengan trade-off:

latency,
contention,
migration risk,
replication lag,
cache inconsistency,
backup/restore complexity,
schema evolution cost,
blast radius,
operational coupling.

Tujuan part ini adalah menyatukan semua materi sebelumnya ke desain sistem.

1. Kaufman Lens: Deconstructing Database System Design

System design dengan SQL bisa dipecah menjadi sub-skill:

Sub-skill	Pertanyaan inti	Feedback cepat
Boundary design	Apa yang harus konsisten dalam satu transaksi?	invariant bisa dilanggar atau tidak
Workload classification	Ini OLTP, OLAP, HTAP, queue, audit, atau search?	query latency dan contention
Data ownership	Siapa source of truth?	conflicting writes
Replication design	Replica dipakai untuk apa?	lag, stale reads
Event propagation	Bagaimana state change keluar dari DB?	lost event / duplicate event
Cache strategy	Apa yang boleh stale?	stale read incident
Migration strategy	Bagaimana schema berubah tanpa downtime?	deployment failure
Failure modelling	Apa yang terjadi saat DB, broker, replica, atau consumer gagal?	incident drill
Observability	Bagaimana mendeteksi query, lock, lag, bloat, pool saturation?	time-to-diagnose

Kita tidak belajar pattern untuk dihafal. Kita belajar constraint dan invariant.

2. Database as Consistency Boundary

Pertanyaan paling penting:

Data apa yang harus berubah bersama-sama atau tidak berubah sama sekali?

Itulah transaction boundary.

Contoh case approval:

update case_file.status,
insert case_status_history,
insert case_decision,
insert audit_event,
insert outbox_event.

Semua ini harus terjadi atomically.

BEGIN;

UPDATE case_file
SET status = 'APPROVED',
    version = version + 1,
    updated_at = current_timestamp
WHERE case_id = :case_id
  AND status = 'UNDER_REVIEW'
  AND version = :expected_version;

INSERT INTO case_status_history (...)
VALUES (...);

INSERT INTO case_decision (...)
VALUES (...);

INSERT INTO audit_event (...)
VALUES (...);

INSERT INTO outbox_event (...)
VALUES (...);

COMMIT;

If any part fails, none should be committed.

This is where relational database shines.

3. Unit of Consistency vs Unit of Deployment

A common mistake: assuming microservice boundary must always split database transaction boundary.

Better distinction:

Concept	Meaning
Unit of consistency	Data that must obey invariant together
Unit of ownership	Team/service responsible for changing data
Unit of deployment	Artifact deployed independently
Unit of scaling	Component scaled independently
Unit of query	Shape needed by user/report/API

These are not always identical.

A regulatory case system may have one strong consistency boundary for case lifecycle, but multiple read models for search, dashboard, notification, and analytics.

4. OLTP vs OLAP vs Operational Analytics

OLTP

OLTP optimizes for small, concurrent, transactional changes.

Examples:

open case,
assign officer,
approve decision,
record payment,
update SLA deadline,
submit evidence.

Characteristics:

short transactions,
strict constraints,
high concurrency,
predictable indexes,
normalized model,
row-level changes,
low latency.

OLAP

OLAP optimizes for large scans, aggregations, joins, and trend analysis.

Examples:

monthly enforcement trend,
officer workload distribution,
cohort retention,
violation severity analysis,
SLA compliance report.

Characteristics:

large scans,
columnar storage useful,
denormalized/star schema useful,
batch/stream ingestion,
higher latency tolerance,
snapshot consistency important.

Operational Analytics

Operational analytics lives between OLTP and OLAP.

Examples:

open cases due in next 24 hours,
backlog by team,
current SLA breach count,
queue health,
stuck workflow detection.

It often runs on OLTP data but can hurt OLTP if not controlled.

Strategy:

indexed summary table,
materialized view,
read replica,
read model database,
CDC-fed dashboard store,
query budget and timeout.

5. Read Replicas

Read replicas are not magic scale-out. They are consistency trade-offs.

Read replicas help when:

read workload is heavy,
stale reads are acceptable,
reports should not overload primary,
failover readiness is needed,
backup can run away from primary.

Read replicas hurt when:

read-after-write consistency is required,
application is unaware of lag,
long analytical queries create replica conflicts,
replica lag causes wrong decisions,
failover assumptions are untested.

6. Read-After-Write Consistency

Problem:

user updates case,
API writes to primary,
next screen reads from replica,
replica has not caught up,
user sees stale state.

This is not a database bug. It is architecture behavior.

Solutions:

Strategy	When useful	Trade-off
Read own writes from primary	user-facing immediate consistency	more primary load
Lag-aware routing	route to primary if replica lag too high	needs lag telemetry
Session consistency token	read from replica only after LSN/binlog position visible	complexity
UI pending state	eventual consistency acceptable	UX design needed
CQRS read model with delay indicator	dashboard/reporting	eventual semantics explicit

For regulatory workflow, do not show approval result from stale replica unless UI explicitly marks freshness.

7. CQRS: Command Model vs Query Model

CQRS means separating write model from read model. It does not mean eventual consistency everywhere by default.

Write model:

normalized,
constraint-heavy,
transactionally correct,
optimized for state changes.

Read model:

denormalized,
query-shaped,
possibly duplicated,
rebuildable,
optimized for user/API/dashboard reads.

CQRS is useful when:

read shape differs greatly from write shape,
query load threatens OLTP,
search/reporting needs denormalized fields,
read model can be rebuilt,
eventual consistency is acceptable for those queries.

CQRS is harmful when:

used to avoid learning SQL joins,
read model becomes source of truth,
projection lag is not observable,
no replay/rebuild strategy exists,
duplicate business rules drift.

8. Transactional Outbox Pattern

Problem: dual write.

Bad flow:

If DB commit succeeds and broker publish fails, state changed but event lost.

Transactional outbox solves this by writing event into the same transaction as business state.

BEGIN;

UPDATE case_file
SET status = 'APPROVED'
WHERE case_id = :case_id
  AND status = 'UNDER_REVIEW';

INSERT INTO outbox_event (
    event_id,
    aggregate_type,
    aggregate_id,
    event_type,
    payload,
    occurred_at,
    published_at
) VALUES (
    :event_id,
    'CaseFile',
    :case_id,
    'CaseApproved',
    :payload,
    current_timestamp,
    NULL
);

COMMIT;

Then a relay publishes the outbox event.

This gives atomic state + event recording, but delivery to broker is usually at-least-once.

Consumers must be idempotent.

9. Outbox Table Design

Example:

CREATE TABLE outbox_event (
    event_id uuid PRIMARY KEY,
    aggregate_type varchar(80) NOT NULL,
    aggregate_id varchar(120) NOT NULL,
    event_type varchar(120) NOT NULL,
    event_version integer NOT NULL,
    payload json NOT NULL,
    headers json NULL,
    occurred_at timestamp NOT NULL,
    available_at timestamp NOT NULL,
    published_at timestamp NULL,
    publish_attempts integer NOT NULL DEFAULT 0,
    last_error text NULL
);

CREATE INDEX idx_outbox_unpublished
ON outbox_event (available_at, occurred_at, event_id)
WHERE published_at IS NULL;

Important fields:

Field	Purpose
`event_id`	idempotency key for consumers
`aggregate_type`	routing and rebuild
`aggregate_id`	ordering/keying per aggregate
`event_type`	schema contract
`event_version`	event evolution
`payload`	business fact
`occurred_at`	source timestamp
`available_at`	delay/retry scheduling
`published_at`	relay progress
`publish_attempts`	poison detection
`last_error`	operational debugging

10. Polling Outbox vs CDC Outbox

Polling Outbox

Relay queries outbox table periodically.

SELECT event_id, payload
FROM outbox_event
WHERE published_at IS NULL
  AND available_at <= current_timestamp
ORDER BY occurred_at, event_id
FETCH FIRST 100 ROWS ONLY
FOR UPDATE SKIP LOCKED;

Pros:

simple,
explicit,
works without CDC infrastructure,
easy to reason about.

Cons:

polling overhead,
locking complexity,
marking published creates more writes,
ordering needs design,
relay failure can leave rows in progress.

CDC Outbox

CDC reads database log and streams inserted outbox rows.

Pros:

avoids polling query,
preserves commit order more naturally depending connector,
scales integration better,
outbox row itself becomes event source.

Cons:

needs CDC infrastructure,
replication slot/binlog retention risk,
connector failure can grow log storage,
schema evolution needs discipline,
operational complexity higher.

11. Change Data Capture

CDC means capturing committed database changes and propagating them to downstream systems.

CDC use cases:

search indexing,
analytics ingestion,
read model projection,
audit streaming,
cache invalidation,
cross-service integration,
migration synchronization.

CDC risks:

lag,
duplicate events,
reordering by key/partition,
schema change breakage,
connector outage,
log retention growth,
snapshot/re-snapshot complexity,
downstream backpressure.

CDC rule:

CDC gives reliable observation of committed changes. It does not remove the need for idempotent consumers and schema contracts.

12. Inbox Pattern

If outbox solves producer-side dual write, inbox solves consumer-side duplicate handling.

Consumer table:

CREATE TABLE consumed_event (
    consumer_name varchar(100) NOT NULL,
    event_id uuid NOT NULL,
    consumed_at timestamp NOT NULL,
    PRIMARY KEY (consumer_name, event_id)
);

Consumer transaction:

BEGIN;

INSERT INTO consumed_event (consumer_name, event_id, consumed_at)
VALUES (:consumer_name, :event_id, current_timestamp);

-- apply business projection/update
UPDATE case_search_projection
SET status = :status,
    updated_at = :occurred_at
WHERE case_id = :case_id;

COMMIT;

If insert conflicts, event was already processed.

This is essential because at-least-once delivery is the realistic default for robust distributed systems.

13. Cache Invalidation

Cache is a second truth unless designed carefully.

Cache strategies:

Strategy	Good for	Risk
Cache-aside	simple reads	stale after write
Write-through	controlled updates	app complexity
Write-behind	high write throughput	data loss/inconsistency risk
CDC invalidation	DB as source of truth	invalidation lag
TTL only	low correctness need	stale window always exists

For regulatory systems, cache domain decisions carefully. Cache reference data and read-only projections more freely than lifecycle state.

14. Cache Key and Invalidation Scope

Bad key:

case:123

Better key includes tenant and representation:

tenant:{tenantId}:case-summary:{caseId}:v3

Invalidation must know:

which aggregate changed,
which query result sets include it,
whether list caches exist,
whether permission-dependent view differs,
whether stale read is acceptable.

For list queries, invalidation is hard. Often better:

short TTL,
event-driven projection,
no list cache for critical workflow queue,
cache only immutable pages/snapshots.

15. Search Index as Read Model

Search index is not source of truth.

Common architecture:

Rules:

index documents should include enough fields for search/list,
canonical detail should come from source DB if correctness matters,
indexer must be idempotent,
reindex from source must be possible,
projection lag must be visible,
delete/soft-delete semantics must be explicit.

16. Reporting and Analytics Architecture

Do not run heavy monthly reports on the OLTP primary unless intentionally budgeted.

Options:

read replica for moderate reports,
ETL/ELT into warehouse/lakehouse,
materialized summary tables,
CDC stream to analytical store,
snapshot export,
DuckDB/local analytical processing for offline/embedded workloads.

Architecture:

Use OLTP for current operational truth. Use analytical store for broad historical scanning.

17. Materialized Summary Tables

For operational dashboards:

CREATE TABLE team_case_backlog_summary (
    tenant_id bigint NOT NULL,
    team_id bigint NOT NULL,
    status varchar(30) NOT NULL,
    case_count bigint NOT NULL,
    breached_count bigint NOT NULL,
    refreshed_at timestamp NOT NULL,
    PRIMARY KEY (tenant_id, team_id, status)
);

Refresh options:

synchronous update in same transaction,
asynchronous event projection,
scheduled recomputation,
hybrid incremental + periodic reconciliation.

Trade-off:

Refresh style	Freshness	Write cost	Complexity
Synchronous	highest	high	medium
Async event	medium	low	high
Batch refresh	low/medium	low	medium
On-demand query	current	query cost high	low

For dashboards, eventual consistency is often acceptable if freshness is visible.

18. Multi-Tenancy with SQL

Common models:

Model	Pros	Cons
Shared DB, shared schema, tenant column	efficient, simple ops	isolation relies on predicates/RLS
Shared DB, schema per tenant	stronger logical separation	migration complexity
DB per tenant	strong isolation, custom backup	operational overhead
Cluster per tenant	strongest isolation	highest cost

Shared schema pattern:

CREATE TABLE case_file (
    tenant_id bigint NOT NULL,
    case_id bigint NOT NULL,
    status varchar(30) NOT NULL,
    opened_at timestamp NOT NULL,
    PRIMARY KEY (tenant_id, case_id)
);

CREATE INDEX idx_case_file_tenant_status_opened
ON case_file (tenant_id, status, opened_at DESC, case_id DESC);

Never forget tenant in unique constraints:

ALTER TABLE case_file
ADD CONSTRAINT uq_case_external_ref
UNIQUE (tenant_id, external_reference);

Tenant isolation is an invariant, not a UI filter.

19. Sharding and Distribution

Sharding is not partitioning with marketing.

Partitioning usually divides data inside one database system. Sharding divides data across independent database nodes.

Use sharding when:

one primary cannot handle write load,
one node cannot hold data economically,
tenant isolation/blast radius requires physical split,
regional data residency requires separate placement.

Avoid sharding when:

query model needs frequent cross-shard joins,
team lacks operational maturity,
bottleneck is missing index/query design,
archive/partitioning would solve the issue.

Shard key questions:

Does it align with transaction boundary?
Does it avoid hot shards?
Can tenants grow unevenly?
How do we move a tenant?
How do we run global reports?
How do we enforce global uniqueness?
How do we handle cross-shard workflow?

20. Cross-Database Transactions

Distributed transactions are expensive operationally and conceptually.

Safer pattern:

keep strong invariant inside one database,
publish event via outbox,
downstream systems update eventually,
use saga/process manager for multi-step workflow,
make every step idempotent,
reconcile periodically.

Saga example:

Not every process can be eventually consistent. Identify which invariant truly requires atomicity.

21. Transactional Messaging Without Illusions

Exactly-once is often a composed illusion built from:

atomic write to source,
durable event log,
idempotent consumer,
deduplication key,
transactional sink,
deterministic replay,
careful offset management.

Design for:

at-least-once delivery,
duplicate events,
delayed events,
partial failure,
replay,
poison messages,
schema evolution.

Consumer idempotency is not optional.

22. Event Schema Design

Bad event:

{
  "status": "APPROVED"
}

Better event:

{
  "eventId": "8a2e...",
  "eventType": "CaseApproved",
  "eventVersion": 2,
  "occurredAt": "2026-07-01T10:15:30Z",
  "tenantId": "tenant-42",
  "caseId": "case-1001",
  "previousStatus": "UNDER_REVIEW",
  "newStatus": "APPROVED",
  "decisionId": "decision-778",
  "approvedBy": "user-91",
  "reasonCode": "COMPLIANT"
}

Event should contain enough context for consumers without forcing them to query source synchronously for every projection.

But do not leak sensitive data unnecessarily.

23. Audit Log vs Event Log

Audit log and event log overlap but are not identical.

Aspect	Audit log	Domain event/outbox
Purpose	accountability, evidence	integration, projection
Audience	compliance, support, investigation	downstream systems
Mutability	append-only, strict	append-only ideally
Payload	who/what/when/why	business fact for consumers
Retention	often long/legal	operational/event retention
Query pattern	reconstruction/investigation	consumption/replay

Do not assume Kafka topic replaces audit table. Auditability often needs local queryable evidence with permission controls.

24. Backup, Restore, and Recovery Design

A database system design is incomplete without recovery thinking.

Questions:

What is RPO?
What is RTO?
Are backups tested?
Can we restore to a point in time?
Are encryption keys backed up?
Are schema migrations recoverable?
How do replicas interact with backups?
How do CDC/outbox consumers recover after restore?
How do we reconcile events emitted after restored point?

Restore drill is not optional for critical systems.

25. Failure Mode: Primary DB Down

Behavior to define:

write APIs fail fast or queue?
read APIs serve stale replica?
admin UI disabled?
background jobs paused?
outbox relay paused?
health checks avoid thundering herd?
circuit breaker protects DB during recovery?

Never let retry storms kill a recovering database.

26. Failure Mode: Replica Lag

Symptoms:

user sees old state,
reports inconsistent,
search result missing recent case,
dashboard undercounts backlog,
failover risks data loss if async replication.

Mitigations:

monitor lag,
route critical reads to primary,
expose freshness timestamp,
limit long queries on replicas,
capacity plan replica apply rate,
alert on lag thresholds,
test failover.

27. Failure Mode: CDC Connector Down

Consequences:

search index stale,
notifications delayed,
analytics delayed,
outbox backlog grows,
WAL/binlog retention grows,
disk risk on primary depending configuration.

Mitigations:

backlog dashboard,
connector health check,
lag metric,
disk alert,
replay runbook,
schema-change compatibility process,
poison event quarantine.

28. Failure Mode: Cache Stale

Questions:

What is maximum acceptable stale window?
Can stale data cause wrong approval/decision?
Does permission change invalidate cached data?
Can user see data after access revoked?
Can list cache omit newly escalated case?

For permission-sensitive data, stale cache can be a security incident.

29. Failure Mode: Migration Half-Complete

A safe migration plan handles:

app v1 and app v2 both running,
nullable transition,
partial backfill,
failed backfill,
constraint validation failure,
rollback not possible,
old code reading new shape,
new code reading old shape,
CDC consumers seeing schema change.

Use expand-contract and feature flags.

30. Case Management Reference Architecture

Core principle:

DB owns canonical lifecycle state.
Audit is written atomically with state change.
Outbox is written atomically with state change.
Read models are rebuildable.
Consumers are idempotent.
Search and dashboard are not source of truth.

31. Write Path Design

Example service flow:

Notice: user success depends on DB commit, not broker publish.

32. Read Path Design

Different reads have different freshness needs.

Read use case	Source	Freshness requirement
confirm approval result	primary OLTP	immediate
case detail audit page	OLTP + audit table	immediate/strong
search list	search index	eventual acceptable if marked
dashboard count	projection/summary	eventual acceptable
legal report	warehouse/snapshot	point-in-time consistency
workflow queue	OLTP or strongly consistent queue table	strong enough to avoid duplicate work

Design read path per use case, not per generic repository.

33. Connection Pool and Backpressure

A relational database can be killed by too many connections even when CPU is not the first bottleneck.

Rules:

use bounded pool,
set statement timeout,
set lock timeout for risky operations,
separate OLTP and reporting pools,
avoid long idle transactions,
use bulkhead per workload,
apply backpressure before DB collapses,
monitor pool wait time.

Connection pool saturation is often the first visible symptom of slow query, lock wait, or DB overload.

34. Query Budgeting

Every query class should have a budget:

Query class	Example	Budget
command write	approve case	low latency, short transaction
user detail read	open case page	low latency
workflow queue	next work item	very low latency, high concurrency
dashboard	backlog summary	moderate latency, bounded
report	monthly compliance	async/off primary
backfill	schema migration	chunked, throttleable
reconciliation	data quality check	scheduled, controlled

Without query budget, every query competes equally. That is operationally false.

35. Database Observability in Architecture

Minimum signals:

query latency by normalized query,
rows examined/returned where available,
lock wait,
deadlock count,
connection pool active/waiting,
transaction age,
replication lag,
CDC lag,
outbox backlog,
cache hit/miss/staleness,
migration duration,
disk growth,
table/index bloat,
vacuum/analyze health where relevant.

Tie signals to user impact:

“approval latency high because update waits on lock”,
“dashboard stale because projection lag 8 minutes”,
“search missing case because CDC connector stopped”,
“database CPU high because report ran on primary”.

36. Data Ownership and Shared Database Trap

Multiple services writing the same tables is dangerous.

Better:

one owner for write model,
other services consume events or use API,
shared read replica only for read-only/reporting with contract,
schema changes controlled by owner,
permissions enforce ownership.

Shared database is not always wrong, but uncontrolled shared writes create hidden coupling.

37. API vs Database Integration

When another system needs data, options include:

Integration	Use when	Risk
Synchronous API	needs current decision/command	latency/coupling
Read replica access	internal reporting	schema coupling
CDC/event stream	downstream projection	eventual consistency
Batch export	periodic reporting	freshness delay
Shared table write	almost never	ownership violation

For regulated data, prefer explicit contracts and auditability over convenience.

38. Reconciliation as Architecture Pattern

Eventual consistency needs reconciliation.

Examples:

-- Cases approved in OLTP but missing from search projection
SELECT c.case_id
FROM case_file c
LEFT JOIN case_search_projection p
  ON p.case_id = c.case_id
WHERE c.status = 'APPROVED'
  AND p.case_id IS NULL;

-- Outbox events not published after threshold
SELECT event_id, event_type, occurred_at, publish_attempts, last_error
FROM outbox_event
WHERE published_at IS NULL
  AND occurred_at < current_timestamp - interval '10 minutes';

Reconciliation jobs should be first-class, not emergency scripts.

39. Decision Heuristics

Use single relational DB when:

invariants cross several tables,
transaction correctness matters,
team size/system scale does not require distribution,
query flexibility is valuable,
operational simplicity matters.

Add read replica when:

reads overload primary,
stale reads are acceptable for some paths,
reporting needs isolation,
failover/read scaling needs justify complexity.

Add outbox/CDC when:

state changes must drive external systems,
dual write risk exists,
read models/search/analytics need reliable updates,
consumers can be idempotent.

Add CQRS read model when:

read shape is expensive on normalized OLTP,
eventual consistency is acceptable,
rebuild/replay is possible,
projection lag is observable.

Add cache when:

data is read-heavy,
staleness is acceptable or controlled,
invalidation is understood,
source of truth remains clear.

Add sharding when:

one primary cannot meet write/storage/isolation requirements,
partitioning/archival/indexing is insufficient,
cross-shard queries are manageable,
operations team can handle it.

40. Anti-Patterns

Anti-pattern: Broker as Source of Truth for Transactional State

Events are important, but canonical transactional state usually belongs in a database with constraints and recovery.

Anti-pattern: Cache Before Index

Many systems add cache because query is slow, when the real issue is missing index or bad query shape.

Anti-pattern: Read Replica for Strong Reads

If user must see their write immediately, route to primary or implement consistency token.

Anti-pattern: Outbox Without Idempotent Consumer

Outbox does not guarantee exactly-once business effect. Consumer must dedupe.

Anti-pattern: CDC Without Operational Ownership

CDC pipeline requires monitoring, schema policy, lag alert, replay plan, and disk/log retention guard.

Anti-pattern: Sharding Before Lifecycle Partitioning

Large table can often be handled with partitioning, archival, indexing, and workload isolation before sharding.

Anti-pattern: Analytics on Primary

A single unbounded report can degrade user-facing writes.

41. Architecture Review Checklist

For every SQL-backed system design, ask:

Consistency

What invariants must be transactionally enforced?
Are they enforced by constraints, guarded writes, or application logic only?
What isolation level is assumed?
Is retry safe?

Ownership

Which service owns writes?
Who can read directly?
Are permissions aligned with ownership?
Are schema changes controlled?

Read Models

Which reads require strong consistency?
Which reads can be stale?
Is freshness visible?
Can read models be rebuilt?

Integration

Is there a dual write?
Is outbox/CDC used?
Are consumers idempotent?
Is event schema versioned?

Operations

What are query budgets?
What are pool limits?
What are lock/statement timeouts?
How is lag monitored?
How are backups tested?
What is the failover plan?

Migration

Is expand-contract used?
Is backfill chunked and throttleable?
Are old and new app versions compatible?
Are CDC consumers schema-compatible?

42. Practice Drill: Design a Regulatory Case Platform

Design the SQL architecture for a case management platform with:

case lifecycle,
assignment,
approval,
escalation,
SLA deadline,
audit trail,
external notification,
search,
dashboard,
monthly compliance report.

Deliverables:

OLTP schema boundaries,
transaction boundary for case approval,
audit table design,
outbox event schema,
idempotent consumer/inbox schema,
search projection design,
dashboard summary design,
read-after-write strategy,
replica usage policy,
cache invalidation policy,
migration strategy,
observability dashboard,
failure-mode table.

Failure-mode table example:

Failure	User impact	Detection	Mitigation
primary DB down	cannot approve case	health check, error rate	fail fast, pause workers, failover
replica lag	stale dashboard	lag metric	freshness badge, route critical reads primary
CDC down	search stale	CDC lag, outbox backlog	restart connector, replay, alert
duplicate event	duplicate notification	inbox conflict, event audit	idempotency key
migration fails mid-backfill	mixed data shape	migration dashboard	resume backfill, roll forward

43. Key Takeaways

A relational database is a consistency boundary, not just storage.
System design starts by identifying invariants that must hold atomically.
Read replicas trade freshness for capacity/isolation.
CQRS is useful when read and write shapes diverge, but read models must be rebuildable.
Outbox prevents producer-side dual-write loss, but consumers must still be idempotent.
CDC is powerful but operationally serious: lag, schema evolution, and log retention matter.
Cache is safe only when staleness and invalidation are explicitly designed.
Sharding is a last-resort distribution strategy, not a substitute for indexing/partitioning.
Every SQL architecture needs observability, recovery, migration, and reconciliation plans.
Top-level SQL mastery means reasoning across query, transaction, schema, and distributed failure boundaries.

The next part is the capstone. We will combine the full series into an end-to-end production casebook: schema, query correctness, optimizer work, transaction anomalies, auditability, migration, observability, and final drills.

Lesson Recap

You just completed lesson 34 in final stretch. Use the series map if you want to review the broader track, or continue directly into the next lesson while the context is still warm.

Back To Series Next Lesson

Continue The Track

Keep the momentum while the lesson is still fresh. Move backward for review or continue forward into the next concept.

Previous Lesson

Lesson 33

Learn Sql In Action Part 033 Sql Across Engines Portability And Vendor Differences

Next Lesson

Lesson 35

Learn Sql In Action Part 035 Capstone Sql In Action Production Casebook