Database Design Interview and Review Drills
Learn Database Design and Architect - Part 083
Advanced database design interview and architecture review drills for practicing production-grade reasoning, tradeoff analysis, invariant modelling, workload analysis, and failure-mode review.
Part 083 — Database Design Interview and Review Drills
Goal bagian ini: mengubah seluruh konsep seri ini menjadi latihan nyata.
Database architect yang kuat bukan hanya bisa menjelaskan normalization, index, transaction, atau CAP. Ia bisa mengambil problem ambigu, mengubahnya menjadi model data, mengunci invariant, memilih boundary, menilai risiko, dan mempertahankan keputusan desain dengan evidence.
Bagian ini berisi drill. Formatnya sengaja dibuat mirip internal engineering review: scenario, hidden ambiguity, expected reasoning, design sketch, failure modes, and review questions.
Tidak ada satu jawaban absolut untuk semua drill. Yang dinilai adalah struktur berpikir.
1. Cara Menggunakan Drill Ini
Gunakan setiap drill sebagai latihan 45–90 menit.
Format latihan:
- Baca scenario.
- Tulis assumptions.
- Identifikasi source of truth.
- Definisikan entity, lifecycle, dan invariant.
- Buat conceptual/logical/physical model.
- Tentukan transaction boundary.
- Tentukan index berdasarkan workload.
- Tentukan migration/evolution path.
- Tentukan observability dan failure mode.
- Tulis ADR ringkas.
Output ideal dari setiap drill:
1. Problem statement
2. Scope and non-scope
3. Data authority map
4. Entity lifecycle diagram
5. Logical schema
6. Critical invariants
7. Main write flows
8. Main read flows
9. Index strategy
10. Consistency and transaction strategy
11. Security/privacy boundary
12. Migration strategy
13. Failure modes
14. Open risks
15. ADR decision summary
2. Rubric Evaluasi Database Architect
Gunakan rubric ini untuk menilai jawaban sendiri atau tim.
| Area | Junior Answer | Strong Answer | Architect-Level Answer |
|---|---|---|---|
| Requirement | Langsung bikin tabel | Tanya field yang kurang | Mengurai business invariant, lifecycle, authority, dan failure mode |
| Data model | Entity-list based | Normalized schema | Model berdasarkan grain, lifecycle, ownership, dan access pattern |
| Constraint | Validasi di aplikasi | Sebagian constraint di DB | Invariant diklasifikasikan: DB/app/job/manual/evidence |
| Index | Tambah index untuk query lambat | Index berdasarkan query | Index sebagai workload contract dengan write amplification awareness |
| Transaction | BEGIN / COMMIT | Hindari lost update | Memilih isolation/lock/idempotency sesuai invariant |
| Evolution | Migration script | Backward-compatible DDL | Expand–migrate–contract, backfill, validation, rollback path |
| Security | Role aplikasi | Tenant filter | Defense-in-depth: RLS, privilege, audit, break-glass, export control |
| Operability | Log error | Basic metric | SLO, runbook, dashboard, incident classification, restore proof |
| Tradeoff | Pilih teknologi populer | Bandingkan opsi | Membuat decision matrix dengan cost of change dan blast radius |
3. Review Ladder: Urutan Pertanyaan yang Benar
Jangan mulai review database dari “table apa saja?”. Mulai dari truth and invariants.
Mental model:
Table is not the design. Table is the compiled artifact of business truth, lifecycle, invariants, workload, and operational constraints.
Drill 1 — Regulatory Case Intake and Assignment
Scenario
Sebuah regulator menerima laporan pelanggaran dari publik, internal monitoring, dan lembaga eksternal. Setiap laporan dapat berubah menjadi satu atau lebih case. Case memiliki assignment ke investigator, SLA, evidence, note, decision, dan enforcement action.
Kebutuhan:
- Laporan dapat duplicate.
- Case bisa digabung atau dipecah.
- Assignment punya history.
- SLA berbeda berdasarkan risk level.
- Evidence harus punya chain of custody.
- Semua decision harus bisa diaudit.
- User hanya boleh melihat case sesuai unit, role, dan sensitivity.
Hidden Ambiguity
Pertanyaan yang harus muncul:
- Apakah intake report selalu menjadi case?
- Apakah duplicate report tetap disimpan?
- Apakah merge menghapus case lama atau membuat relation?
- Apakah assignment current state atau event history?
- Apakah SLA dihitung dari intake, triage, assignment, atau state tertentu?
- Apakah evidence metadata dan binary object disimpan di database yang sama?
- Apakah decision bisa direvisi?
Expected Reasoning
Pisahkan minimal entity berikut:
intake_report
case_file
case_relation
case_assignment
case_state_transition
case_sla_clock
case_evidence
case_evidence_custody_event
case_decision
case_enforcement_action
case_access_grant
case_audit_event
Core Invariant
| Invariant | Enforcement |
|---|---|
| Case number unique | Unique constraint |
| Active assignment maksimal satu per role/case | Partial unique index |
| State transition harus legal | Transition table + transactional guard |
| Evidence custody event append-only | Insert-only table + restricted privileges |
| Decision references evidence snapshot | FK + immutable evidence version |
| User access tidak by convention | RLS/policy/materialized grant |
Schema Sketch
create table case_file (
case_id uuid primary key,
case_number text not null unique,
current_state text not null,
risk_level text not null,
sensitivity_level text not null,
owning_unit_id uuid not null,
opened_at timestamptz not null,
closed_at timestamptz,
version bigint not null default 0,
created_at timestamptz not null default now(),
updated_at timestamptz not null default now(),
check (closed_at is null or closed_at >= opened_at)
);
create table case_assignment (
assignment_id uuid primary key,
case_id uuid not null references case_file(case_id),
role_code text not null,
assignee_user_id uuid not null,
assigned_by_user_id uuid not null,
assigned_at timestamptz not null,
unassigned_at timestamptz,
reason text not null,
check (unassigned_at is null or unassigned_at >= assigned_at)
);
create unique index uq_active_case_assignment
on case_assignment(case_id, role_code)
where unassigned_at is null;
Review Questions
- Apa source of truth untuk current case state?
- Apa difference antara
case_file.current_statedancase_state_transition? - Bagaimana mencegah dua investigator menjadi primary assignee secara race condition?
- Bagaimana merge case direpresentasikan tanpa kehilangan audit trail?
- Bagaimana report yang duplicate tetap menjadi evidence?
- Bagaimana reconstruct state pada tanggal tertentu?
- Query apa yang paling sering muncul untuk work queue?
- Index apa yang dibutuhkan untuk SLA dashboard?
- Data apa yang masuk retention/purge?
- Bagaimana user access diuji secara automated?
Red Flags
case.statusmenjadi tempat semua meaning.- Assignment overwrite tanpa history.
- Evidence diganti in-place.
- Access control hanya
WHERE unit_id = ?di aplikasi. - Merge case dilakukan dengan delete.
- SLA dihitung ulang dari data ambigu.
Drill 2 — High-Volume Transaction Ledger
Scenario
Bangun ledger untuk transaksi finansial internal. Sistem menerima command debit/credit dari payment processor, partner API, dan batch settlement. Requirement:
- Tidak boleh double-posting.
- Balance harus konsisten.
- Correction tidak boleh update transaksi lama.
- Harus support reconciliation harian.
- Latency write p95 < 50 ms.
- Query balance by account harus cepat.
- Audit harus bisa membuktikan asal setiap posting.
Expected Model
ledger_account
ledger_journal
ledger_entry
account_balance_projection
posting_command
external_reference
reconciliation_batch
reconciliation_difference
Core Principle
Ledger bukan tabel balance yang di-update sembarangan.
Ledger adalah:
immutable journal + derived balance projection + reconciliation proof
Write Flow
Critical Invariants
| Invariant | Enforcement |
|---|---|
| Idempotency key unique per source | Unique constraint |
| Journal balanced: sum debit = sum credit | Deferred validation or controlled function |
| Entry immutable | No update/delete privilege |
| Balance projection derived from entries | Reconciliation job |
| Correction via reversal/new journal | Business rule + restricted API |
Interview Questions
- Mengapa
balancetidak cukup sebagai source of truth? - Bagaimana idempotency key didesain?
- Apa yang terjadi jika API timeout setelah commit?
- Bagaimana membedakan duplicate request dan changed retry?
- Bagaimana melakukan correction?
- Bagaimana reconcile projection balance dengan journal?
- Apakah account row perlu di-lock?
- Bagaimana menghindari hot account bottleneck?
- Bagaimana partitioning ledger_entry?
- Apa tradeoff antara realtime balance dan eventual projection?
Failure Modes
- Duplicate posting karena retry tanpa idempotency.
- Lost update pada balance projection.
- Reversal salah arah.
- Reconciliation tidak reproducible.
- External reference tidak unique.
- Backfill historical ledger memicu index/WAL pressure.
Drill 3 — Multi-Tenant SaaS with Enterprise Isolation
Scenario
SaaS B2B memiliki 20.000 tenant. Sebagian kecil enterprise tenant butuh dedicated database, data residency, custom retention, dan private support access. Mayoritas tenant menggunakan pooled database.
Kebutuhan:
- Tenant kecil murah dioperasikan.
- Enterprise tenant bisa dimigrasikan ke dedicated DB.
- Query report per tenant cepat.
- Backup/restore tenant penting.
- Support engineer tidak boleh bebas melihat data tenant.
- Noisy tenant tidak boleh menurunkan semua tenant.
Architecture Options
| Model | Fit | Risk |
|---|---|---|
| Pooled schema | Cost efficient | Isolation by discipline |
| Schema per tenant | Medium isolation | Migration/DDL overhead |
| DB per tenant | Strong isolation | Operational explosion |
| Hybrid/cell | Balanced | Routing/catalog complexity |
Expected Components
tenant_catalog
tenant_region
tenant_database_placement
tenant_plan
tenant_feature_flag
tenant_access_policy
tenant_restore_request
tenant_migration_job
tenant_metric_daily
Key Design Question
Is
tenant_idpart of every primary key, every foreign key, every index, every audit event, every outbox event, and every metric dimension?
Kalau tidak, ada kemungkinan tenant isolation bocor.
Review Questions
- Bagaimana tenant ditemukan dari request?
- Apakah tenant routing terjadi sebelum query DB?
- Apakah FK composite memakai
tenant_id? - Bagaimana mencegah cross-tenant join?
- Apakah RLS dipakai sebagai safety net?
- Bagaimana restore satu tenant dari backup?
- Bagaimana migrasi tenant pool → dedicated?
- Bagaimana per-tenant rate limit dan capacity?
- Bagaimana support access diaudit?
- Bagaimana report global lintas tenant dibuat tanpa bocor PII?
Red Flags
- Tenant filter hanya di service layer tanpa guardrail.
- Foreign key tidak tenant-scoped.
- Unique constraint global padahal seharusnya per tenant.
- Tenant migration dianggap export/import sederhana.
- Backup strategy hanya database-level tanpa tenant restore drill.
Drill 4 — Schema Migration from Status Soup
Scenario
Legacy system memiliki tabel:
customer_case(
id,
status,
sub_status,
assigned_to,
approved_by,
rejected_reason,
closed_reason,
escalated_flag,
pending_flag,
deleted_flag,
updated_at
)
Masalah:
- Banyak kombinasi status ilegal.
- Report berbeda antar tim.
- Tidak ada transition history.
- Sulit menambahkan workflow baru.
- Migration harus zero downtime.
Expected Refactoring Strategy
Gunakan expand–migrate–contract.
New Model
case_file
case_state_transition
case_assignment
case_escalation
case_decision
case_closure
Review Questions
- Apa canonical source of truth selama migration?
- Apakah old status bisa dipetakan losslessly?
- Data mana yang tidak bisa dipulihkan dari legacy shape?
- Bagaimana menangani illegal historical data?
- Bagaimana dual-write diuji?
- Bagaimana rollback setelah cutover?
- Bagaimana menghindari lock besar saat backfill?
- Query mana yang perlu compatibility view?
- Bagaimana report lama divalidasi dengan report baru?
- Kapan old columns aman dihapus?
Architect-Level Answer
Jawaban matang tidak hanya membuat tabel baru. Ia menyatakan:
- transitional source of truth;
- compatibility layer;
- data repair plan;
- backfill chunking;
- validation query;
- cutover gate;
- rollback/roll-forward decision;
- communication ke consumer.
Drill 5 — Event-Driven Projection Rebuild
Scenario
Operational database mengirim event ke search index, warehouse, cache, dan notification service. Setelah bug ditemukan, search projection harus direbuild dari database tanpa kehilangan update baru.
Expected Architecture
canonical_db
outbox_event
cdc_relay
event_broker
projection_consumer
projection_checkpoint
projection_rebuild_job
projection_validation_result
Rebuild Strategy
Review Questions
- Apa watermark yang dipakai untuk snapshot?
- Bagaimana event baru selama rebuild diproses?
- Apakah projection update idempotent?
- Apakah event ordering per aggregate dijamin?
- Bagaimana delete/tombstone dipropagate?
- Bagaimana schema version event ditangani?
- Bagaimana authorization data ikut projection?
- Bagaimana validasi hasil rebuild?
- Bagaimana rollback alias projection?
- Apa SLO freshness projection?
Red Flags
- Rebuild dengan full scan tanpa watermark.
- Projection tidak punya version/alias.
- Consumer tidak idempotent.
- Delete event diabaikan.
- Authorization filter hanya di search API, bukan indexed document.
Drill 6 — Database for Approval Workflow with Delegation
Scenario
Sistem approval procurement harus support:
- approval berjenjang;
- delegation sementara;
- substitute approver;
- maker-checker;
- approval timeout;
- audit decision;
- rule berubah dari waktu ke waktu.
Expected Model
approval_request
approval_step
approval_assignment
approval_decision
approval_policy_version
approval_delegation
approval_timeout_event
Key Invariants
| Invariant | Enforcement |
|---|---|
| Maker tidak boleh approve request sendiri | Transactional guard |
| Satu active decision per step | Unique partial index |
| Delegation hanya valid dalam effective period | Temporal constraint/query guard |
| Approval policy version immutable after activation | Restricted update |
| Decision append-only | Insert-only privilege |
Review Questions
- Apakah approval rule disimpan sebagai data atau code?
- Bagaimana policy version dipilih saat request dibuat?
- Apakah perubahan policy berdampak ke request berjalan?
- Bagaimana delegation direpresentasikan?
- Bagaimana timeout diproses?
- Bagaimana mencegah double approval?
- Bagaimana audit membuktikan approver eligible saat itu?
- Bagaimana report approval bottleneck dibuat?
- Bagaimana rollback jika policy salah?
- Bagaimana migration dari old approval rule?
Drill 7 — Data Retention and Legal Hold
Scenario
Sistem menyimpan customer document, case note, communication, audit event, dan exported report. Beberapa data harus dipurge setelah 7 tahun. Namun jika ada legal hold, purge tidak boleh terjadi. User juga dapat meminta erasure untuk PII tertentu.
Expected Model
data_asset_inventory
retention_policy
retention_policy_assignment
legal_hold
purge_candidate
purge_job
purge_job_item
privacy_erasure_request
erasure_action_log
Key Questions
- Apa asset inventory-nya?
- Apa retention trigger date untuk setiap entity?
- Apakah legal hold entity-level atau subject-level?
- Bagaimana PII ditemukan di projection/search/warehouse?
- Apakah audit event boleh dihapus atau dimask?
- Bagaimana backup menangani erasure?
- Bagaimana purge dibuktikan?
- Bagaimana purge job dibuat idempotent?
- Bagaimana partial purge tidak merusak referential integrity?
- Bagaimana retention policy versioned?
Red Flags
deleted_atdianggap sama dengan retention.- Legal hold hanya boolean di main table.
- Tidak ada data inventory.
- Projection/search tidak ikut erase.
- Purge tanpa evidence log.
Drill 8 — Hot Work Queue
Scenario
Sistem memiliki tabel task queue. Ribuan worker mengambil task dari database. Saat traffic naik, database mengalami lock contention, CPU tinggi, dan banyak timeout.
Legacy query:
select *
from task
where status = 'READY'
order by priority desc, created_at asc
limit 1
for update;
Expected Diagnosis
Masalah potensial:
- semua worker berebut row/index range yang sama;
- query tidak pakai
SKIP LOCKED; - priority tinggi membuat hotspot;
- task status update menciptakan bloat;
- index tidak sesuai predicate/order;
- retry worker memperparah load.
Improved Pattern
with claimed as (
select task_id
from task
where status = 'READY'
and available_at <= now()
and bucket_id = :worker_bucket
order by priority desc, created_at asc
limit 10
for update skip locked
)
update task t
set status = 'RUNNING',
claimed_by = :worker_id,
claimed_at = now()
from claimed c
where t.task_id = c.task_id
returning t.*;
Review Questions
- Apa queue ordering guarantee yang benar-benar dibutuhkan?
- Apakah strict global priority worth the contention?
- Bagaimana membagi queue ke bucket?
- Bagaimana retry dan dead-letter queue dimodelkan?
- Bagaimana lease timeout bekerja?
- Bagaimana mencegah task lost saat worker crash?
- Index apa yang dibutuhkan?
- Bagaimana mengukur queue lag?
- Kapan database queue harus diganti broker?
- Bagaimana migrasi tanpa kehilangan task?
Drill 9 — Search + Authorization Projection
Scenario
Case management system butuh search lintas case, party, evidence metadata, note, dan decision. Search harus cepat, tetapi user hanya boleh melihat case yang authorized.
Expected Design
Search index bukan source of truth. Ia projection.
case_file canonical
case_access_grant authorization truth
case_search_document projection source
search_index external projection
search_index_checkpoint operational state
Review Questions
- Data apa yang masuk search document?
- Apakah authorization di-index atau dicek balik ke DB?
- Bagaimana revocation access dipropagate?
- Apakah sensitive field dimask sebelum indexing?
- Bagaimana delete/tombstone?
- Bagaimana rebuild index?
- Bagaimana stale search result ditangani?
- Bagaimana search result count dipakai di UI jika filter authorization post-query?
- Bagaimana multi-tenant search isolation?
- Bagaimana audit search access?
Red Flags
- Search index mengandung PII tanpa classification.
- Authorization hanya post-filter setelah mengambil 10 result.
- Access revocation tidak punya event.
- Rebuild index tidak deterministic.
- Tidak ada projection version.
Drill 10 — Global Distributed Application
Scenario
Aplikasi dipakai di Asia, Europe, dan US. Setiap tenant punya home region. Beberapa operasi harus local-latency, beberapa harus globally visible. Data tertentu tidak boleh keluar region.
Expected Reasoning
Klasifikasikan operasi:
| Operation | Consistency Need | Region Strategy |
|---|---|---|
| Tenant config read | Strong enough / cached | Home region + cache |
| Case update | Strong per case | Home region ownership |
| Global search | Eventual | Regional projection |
| User login | Low latency | Global identity/cache |
| Audit write | Durable/local | Home region + replicated metadata |
| Cross-region transfer | Controlled | Saga + ownership transfer |
Review Questions
- Apa unit of locality: tenant, account, case, user, or document?
- Bagaimana request routing ke home region?
- Bagaimana data residency enforced?
- Apakah global transaction benar-benar dibutuhkan?
- Bagaimana conflict dihindari?
- Apa yang terjadi saat region partition?
- Bagaimana failover menulis tanpa split-brain?
- Bagaimana stale global search dikomunikasikan?
- Bagaimana backup/restore per region?
- Bagaimana audit chain lintas region?
Drill 11 — Analytics Metric Contract
Scenario
Leadership dashboard menampilkan jumlah open cases, average resolution time, SLA breach, enforcement amount, dan investigator workload. Angka sering berbeda antara dashboard A dan dashboard B.
Expected Design
Buat semantic metric contract.
metric_definition
metric_version
metric_dimension
metric_run
metric_result
metric_reconciliation
Questions
- Apa definisi
open case? - Apakah reopen case dihitung sebagai case baru?
- Apa grain dari metric?
- Apa snapshot time?
- Apakah timezone ditentukan?
- Apakah deleted/merged case masuk?
- Bagaimana data late-arriving ditangani?
- Bagaimana metric versioning?
- Bagaimana dashboard mereferensikan metric version?
- Bagaimana reconcile ke source of truth?
Red Flags
- Metric didefinisikan ulang di setiap dashboard.
- Tidak ada grain.
- Tidak ada effective date.
- Query langsung ke OLTP untuk leadership dashboard berat.
- Tidak ada reproducible report run.
Drill 12 — Incident: Sudden Database Latency Spike
Scenario
Pukul 10:15, latency API naik dari p95 80 ms ke 2 detik. Error timeout meningkat. CPU DB 70%, I/O naik, active connections penuh, dan banyak query menunggu lock.
Expected Triage
First Queries
select pid, state, wait_event_type, wait_event, query, now() - query_start as age
from pg_stat_activity
where state <> 'idle'
order by age desc;
select blocked.pid as blocked_pid,
blocked.query as blocked_query,
blocking.pid as blocking_pid,
blocking.query as blocking_query
from pg_locks blocked_locks
join pg_stat_activity blocked on blocked.pid = blocked_locks.pid
join pg_locks blocking_locks
on blocking_locks.locktype = blocked_locks.locktype
and blocking_locks.database is not distinct from blocked_locks.database
and blocking_locks.relation is not distinct from blocked_locks.relation
and blocking_locks.page is not distinct from blocked_locks.page
and blocking_locks.tuple is not distinct from blocked_locks.tuple
and blocking_locks.virtualxid is not distinct from blocked_locks.virtualxid
and blocking_locks.transactionid is not distinct from blocked_locks.transactionid
and blocking_locks.classid is not distinct from blocked_locks.classid
and blocking_locks.objid is not distinct from blocked_locks.objid
and blocking_locks.objsubid is not distinct from blocked_locks.objsubid
and blocking_locks.pid <> blocked_locks.pid
join pg_stat_activity blocking on blocking.pid = blocking_locks.pid
where not blocked_locks.granted
and blocking_locks.granted;
Review Questions
- Apa deployment terakhir?
- Apakah ada migration/backfill/report baru?
- Query apa paling banyak wait time?
- Apakah lock blocker aman dihentikan?
- Apakah connection storm dari aplikasi?
- Apakah query plan berubah karena stats stale?
- Apakah autovacuum tertahan long transaction?
- Apakah replica lag memaksa traffic ke primary?
- Apa mitigation tercepat tanpa data corruption?
- Apa permanent fix?
Drill 13 — Document Store vs Relational Store Decision
Scenario
Product team ingin menyimpan form submission dinamis. Setiap form punya field berbeda, conditional section, attachment, reviewer note, validation rule, dan report tahunan.
Pilihan:
- Semua field jadi JSON di PostgreSQL.
- Dynamic EAV table.
- Document database.
- Relational core + JSON extension.
- Form engine dengan versioned schema + materialized reporting table.
Expected Decision Criteria
| Question | Why It Matters |
|---|---|
| Apakah field perlu query/filter global? | Menentukan index/reporting design |
| Apakah schema form versioned? | Menentukan reproducibility |
| Apakah validation harus immutable per submission? | Menentukan evidence contract |
| Apakah attachment binary atau metadata? | Menentukan storage boundary |
| Apakah report butuh normalized dimension? | Menentukan analytical projection |
Review Questions
- Apa yang menjadi source of truth: form definition atau submission JSON?
- Apakah submitted data harus mempertahankan label/rule saat submit?
- Bagaimana indexing field dinamis?
- Bagaimana mencegah JSON dump tanpa contract?
- Bagaimana report lintas form dibuat?
- Apakah validation rule bisa berubah?
- Bagaimana migration form version?
- Bagaimana privacy classification field?
- Bagaimana search?
- Bagaimana audit correction?
Drill 14 — Access Control Bug in Multi-Tenant Reporting
Scenario
Bug ditemukan: seorang admin tenant A bisa melihat aggregate report yang mencakup tenant B. Tidak ada raw record yang bocor, tetapi aggregate count dan revenue tenant B muncul dalam dashboard.
Expected Incident Analysis
Ini tetap security incident. Aggregate leakage adalah information disclosure.
Investigasi:
- Apakah OLTP tenant filter benar?
- Apakah analytical pipeline membawa tenant_id?
- Apakah semantic layer filter tenant?
- Apakah cache key tenant-aware?
- Apakah materialized view dibuat global lalu difilter terlambat?
- Apakah RLS aktif di reporting DB?
- Apakah dashboard query bypass service?
- Apakah export endpoint punya filter berbeda?
Fix Pattern
source tenant boundary
→ CDC includes tenant_id
→ warehouse table includes tenant_id
→ semantic metric includes tenant dimension
→ dashboard role maps to tenant scope
→ cache key includes tenant scope
→ automated negative test
Review Questions
- Di layer mana tenant boundary hilang?
- Apakah aggregate report dianggap sensitive?
- Bagaimana historical incorrect report ditangani?
- Apakah audit log cukup untuk impact assessment?
- Apa preventive control?
- Apa detective control?
- Apa test yang harus ditambah?
- Apakah incident perlu customer notification?
- Bagaimana rebuild materialized report?
- Apa governance change setelah incident?
Drill 15 — Database Design Document Review
Scenario
Tim mengajukan desain database untuk fitur baru. Mereka membawa ERD dan migration script. Review kamu 60 menit.
Review Flow
Jangan langsung review DDL. Pakai urutan ini:
1. Problem and scope
2. Data authority
3. Lifecycle
4. Invariants
5. Workload
6. Schema
7. Transaction
8. Index
9. Migration
10. Security/privacy
11. Operations
12. Failure modes
13. Decision and actions
Questions to Ask
Authority
- Tabel mana source of truth?
- Ada duplicate truth?
- Ada projection/cache/search/warehouse?
Lifecycle
- Bagaimana entity dibuat?
- Siapa boleh mengubah?
- Apa terminal state?
- Apa correction path?
Invariant
- State apa yang ilegal?
- Mana yang DB-enforced?
- Mana yang app-enforced?
- Mana yang async-validated?
Workload
- Top 10 query path apa?
- Write rate?
- Read rate?
- Cardinality?
- Data growth?
- Tenant skew?
Migration
- Apakah DDL blocking?
- Apakah backward compatible?
- Apakah ada backfill?
- Bagaimana validation?
- Bagaimana rollback?
Operations
- Metric apa yang membuktikan healthy?
- Runbook apa yang tersedia?
- Backup/restore impact?
- Alert threshold?
Review Outcome
Gunakan outcome eksplisit:
APPROVED
APPROVED WITH CONDITIONS
NEEDS REVISION
BLOCKED
Jangan gunakan “LGTM” untuk desain database yang punya irreversible impact.
Advanced Interview Question Bank
Data Modelling
- Bedakan conceptual, logical, physical model.
- Apa grain dari tabel dan kenapa grain penting?
- Kapan relationship harus menjadi entity sendiri?
- Kapan natural key lebih baik dari surrogate key?
- Bagaimana memodelkan hierarchy yang berubah historis?
- Apa masalah polymorphic foreign key?
- Kapan JSON di relational DB masuk akal?
- Apa tanda EAV sudah menjadi design smell?
- Bagaimana memodelkan correction vs update?
- Bagaimana bitemporal modelling bekerja?
Transactions and Concurrency
- Jelaskan lost update dan cara mencegahnya.
- Apa beda optimistic locking dan pessimistic locking?
- Apa itu write skew?
- Mengapa serializable bukan silver bullet?
- Kapan perlu parent-row lock?
- Bagaimana mendesain idempotent command?
- Apa commit-unknown problem?
- Bagaimana outbox mencegah dual-write?
- Apa efek long transaction pada MVCC?
- Bagaimana mendesain retry policy yang aman?
Index and Query Performance
- Apa arti sargability?
- Bagaimana urutan column composite index dipilih?
- Kapan partial index lebih baik?
- Mengapa index bisa memperlambat write?
- Bagaimana membaca estimated vs actual rows?
- Apa penyebab query plan regression?
- Mengapa OFFSET pagination buruk di data besar?
- Bagaimana keyset pagination bekerja?
- Kapan covering index berguna?
- Kapan index tidak dipakai planner?
Distributed Systems
- Apa bedanya replication, partitioning, dan sharding?
- Apa itu read-your-writes consistency?
- Bagaimana replica lag memengaruhi UX?
- Apa beda 2PC dan saga?
- Apa failure mode saga compensation?
- Apa tradeoff CAP/PACELC dalam aplikasi nyata?
- Apa unit of locality di multi-region design?
- Bagaimana mencegah split-brain write?
- Apa data residency implication untuk schema?
- Kapan distributed SQL masuk akal?
Security, Privacy, and Compliance
- Apa bedanya authorization di aplikasi dan RLS di database?
- Apa risiko
BYPASSRLS? - Bagaimana support break-glass access diaudit?
- Bagaimana sensitive data masuk search index?
- Apa beda soft delete dan retention purge?
- Bagaimana legal hold memengaruhi purge?
- Bagaimana erasure request dipropagate ke warehouse/search/cache?
- Apa privacy risk pada audit log?
- Apa itu data minimization dalam schema design?
- Bagaimana membuktikan report reproducible?
Operations
- Apa first query saat lock storm?
- Bagaimana membedakan CPU-bound vs I/O-bound DB?
- Apa tanda index bloat?
- Apa metric untuk replication lag?
- Apa yang harus ada di restore drill?
- Bagaimana mengukur capacity envelope?
- Apa failure mode migration backfill?
- Apa runbook untuk disk/WAL full?
- Bagaimana menilai noisy tenant?
- Apa evidence bahwa database production-ready?
Architecture Review Scoring Sheet
Gunakan scoring 0–3.
| Dimension | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| Data authority | Tidak jelas | Sebagian jelas | Jelas | Jelas + documented + tested |
| Invariants | Tidak disebut | App-only | DB/app split | Enforced + tested + monitored |
| Workload | Tidak ada | Query list | Workload profile | Workload + growth + benchmark |
| Schema | Ad hoc | Reasonable | Normalized/fit | Evolvable + defensible |
| Index | Guessing | Basic | Query-aligned | Cost-aware + lifecycle managed |
| Transaction | Implicit | Basic transaction | Race-aware | Isolation/lock/retry/idempotency explicit |
| Migration | Blocking | Script only | Backward-compatible | Expand–migrate–contract + validation |
| Security | App convention | Role-based | Scoped | Defense-in-depth + audit |
| Privacy | Ignored | PII noted | Classified | Retention/erasure/projection covered |
| Observability | Logs only | Some metrics | Dashboard | SLO/runbook/drill/evidence |
| Recovery | Backup assumed | Backup configured | Restore tested | RPO/RTO proven + tenant/partial strategy |
| Decision | Opinion | Some rationale | ADR | ADR + alternatives + consequences |
Interpretation:
0–12 : Not production-ready
13–24 : Prototype / internal-only
25–30 : Needs revision before launch
31–36 : Launchable with conditions
37–42 : Production-ready
43–48 : High-confidence architecture
Final Drill: 30-Minute Database Architecture Defense
Ambil salah satu desain database yang pernah kamu buat. Siapkan jawaban untuk panel review:
- Apa source of truth utama?
- Apa 5 invariant paling penting?
- State ilegal apa yang database tolak?
- Apa top 5 query dan index-nya?
- Apa top 3 write command dan transaction boundary-nya?
- Apa concurrency race paling berbahaya?
- Apa migration paling berisiko?
- Apa data yang harus di-retain/purge?
- Apa security boundary paling kritis?
- Apa yang terjadi saat database lambat?
- Apa yang terjadi saat replica stale?
- Apa yang terjadi saat event consumer tertinggal?
- Apa restore path?
- Apa scaling trigger?
- Apa keputusan yang paling sulit dibalik desain ini?
Jika kamu bisa menjawab 15 pertanyaan ini tanpa defensif dan dengan evidence, desainmu sudah berada di level engineering review yang matang.
What Excellent Looks Like
Jawaban kuat biasanya punya pola seperti ini:
We chose X because invariant A and workload B matter more than tradeoff C.
We rejected Y because it would weaken recovery path D and increase operational risk E.
We enforce correctness at layer F, validate drift with job G, and monitor symptom H.
If assumption I changes, we will revisit using migration path J.
Database architect yang matang tidak berkata:
This is best practice.
Ia berkata:
Given these invariants, workload, failure modes, and operational constraints, this is the lowest-risk design we can defend today.
Checklist Penutup Part 083
Sebelum lanjut ke final synthesis, pastikan kamu bisa:
- Mengubah scenario ambigu menjadi entity lifecycle.
- Menemukan hidden invariant.
- Menentukan source of truth vs projection.
- Mendesain schema dengan constraint yang meaningful.
- Menentukan index dari query shape, bukan feeling.
- Menjelaskan transaction boundary dan concurrency risk.
- Mendesain zero-downtime migration.
- Menilai security/privacy boundary.
- Membuat failure-mode analysis.
- Menulis ADR database yang bisa dipertahankan.
Kalau semua drill ini terasa “banyak”, itu normal. Database architecture memang bukan hanya SQL. Ia adalah desain sistem kebenaran, waktu, perubahan, dan kegagalan.
You just completed lesson 83 in final stretch. Use the series map if you want to review the broader track, or continue directly into the next lesson while the context is still warm.
Keep the momentum while the lesson is still fresh. Move backward for review or continue forward into the next concept.