Build CoreOrdered learning track

Formal Methods for Working Engineers

Learn Java Formal Methods, Testing, Benchmarking, and Performance Engineering - Part 019

Formal methods untuk working Java engineers: invariant, safety, liveness, model checking, contracts, refinement, counterexample analysis, dan cara menjembatani formal specification ke JUnit, property-based testing, CI, dan production evidence.

10 min read1853 words
PrevNext
Lesson 1940 lesson track0922 Build Core
#java#formal-methods#verification#model-checking+8 more

Part 019 — Formal Methods for Working Engineers

Tujuan bagian ini: membuat formal methods menjadi alat kerja engineer Java, bukan topik akademik yang jauh dari implementasi.

Formal methods sering terdengar seperti ini:

theorem proving
mathematical logic
temporal formulas
state space exploration
proof obligations

Bagi working engineer, framing itu terlalu berat di awal.

Framing yang lebih berguna:

Formal methods = cara menulis desain sistem secara cukup presisi sehingga mesin atau manusia dapat menemukan kesalahan desain sebelum kode produksi dibuat.

Formal methods bukan pengganti engineering judgement. Ia memperkuat judgement.

Formal methods bukan pengganti test. Ia menemukan jenis bug yang test biasa sering lewatkan.

Formal methods bukan kewajiban untuk seluruh codebase. Ia adalah alat khusus untuk area yang failure cost-nya tinggi, state-space-nya rumit, atau concurrency/distributed behavior-nya sulit dipikirkan manual.


1. Masalah Nyata yang Ingin Diselesaikan

Test biasa biasanya menanyakan:

Untuk contoh input ini, apakah output-nya benar?

Formal methods menanyakan:

Untuk semua state yang mungkin di model ini, apakah properti penting selalu benar?

Perbedaannya besar.

Dalam sistem Java production, bug paling mahal sering bukan bug syntax atau salah mapping DTO. Bug mahal biasanya berbentuk:

state lifecycle invalid
retry menyebabkan double side effect
concurrent command melanggar invariant
approval/resolution terjadi dalam urutan salah
event diterbitkan tanpa perubahan state yang sah
compensation tidak mengembalikan invariant
timeout menghasilkan status ambigu
cache dan database disagree
message redelivery membuat transaksi tidak idempotent
optimistic locking gagal tetapi caller menganggap sukses

Bug seperti ini jarang ditemukan oleh satu unit test sederhana. Mereka muncul dari kombinasi interleaving, ordering, duplicate delivery, stale reads, dan partial failure.

Formal methods memberi bahasa untuk menyatakan:

Apa state sistem?
Apa action yang mengubah state?
Properti apa yang tidak boleh dilanggar?
Progress apa yang akhirnya harus terjadi?
Abstraksi implementasi mana yang harus tetap konsisten dengan model?

2. Formal Methods dalam Evidence Ladder

Dari Part 003, kita sudah punya verification ladder. Formal methods menempati posisi yang berbeda dari test biasa.

Kekuatan formal methods bukan karena ia lebih “tinggi” dari test. Kekuatannya karena ia menjawab pertanyaan berbeda.

EvidencePertanyaan yang DijawabKelemahan
Unit testContoh perilaku ini benar?Lemah terhadap kombinasi state besar
Property-based testProperti ini bertahan untuk banyak input generated?Generator dan oracle bisa salah
Mutation testTest suite benar-benar mendeteksi perubahan salah?Tidak membuktikan desain benar
Integration testKomponen benar saat dependency nyata dipakai?Mahal, lambat, tidak exhaustive
Formal modelSemua state di model kecil ini aman?Model bisa tidak merepresentasikan realitas
Production telemetrySistem aktual sehat?Biasanya setelah risiko terjadi

Formal methods berguna ketika pertanyaan utamanya adalah:

Can this design fail in some ordering we did not imagine?

3. Definisi Kerja: Model, State, Action, Property

Agar tidak abstrak, gunakan empat konsep inti.

3.1 Model

Model adalah representasi sengaja-disederhanakan dari sistem.

Model bukan kode produksi. Model juga bukan diagram cantik.

Model harus cukup presisi untuk dieksekusi atau dianalisis.

Contoh sistem Java:

Case Management Service
- menerima command submit/assign/escalate/resolve
- menyimpan case state di PostgreSQL
- menerbitkan domain event melalui outbox
- memakai optimistic locking
- command bisa retry karena timeout
- event consumer bisa menerima duplicate event

Model tidak perlu memasukkan semua field database. Ia cukup memasukkan state yang relevan terhadap correctness.

caseStatus
caseVersion
processedCommandIds
outboxEvents
assignedOfficer
escalationFlag

3.2 State

State adalah snapshot informasi yang menentukan perilaku berikutnya.

Dalam Java system, state bisa tersebar:

heap object
database row
message queue
cache
scheduler
external service state
clock-derived state

Formal model memaksa kita memilih state yang relevan.

Kalau state penting tidak masuk model, model bisa memberi confidence palsu.

3.3 Action

Action adalah perubahan state.

Dalam Java, action bisa berupa:

HTTP request handler
message consumer handler
scheduled job
database transaction
retry attempt
compensation step
manual admin action

Dalam model, action harus eksplisit.

Contoh:

SubmitCase
AssignCase
EscalateCase
ResolveCase
RetryCommand
PublishOutboxEvent
ConsumeEvent
CrashBeforeCommit
CrashAfterCommitBeforeAck

3.4 Property

Property adalah klaim yang ingin kita pertahankan.

Ada dua keluarga utama:

safety  = bad thing never happens
liveness = good thing eventually happens

Safety:

Resolved case must never become InReview again.
One case must never have two active assignees.
A domain event must never be published without corresponding committed state.
The same command id must never cause two different outcomes.

Liveness:

A submitted case is eventually reviewed if reviewers are available.
An outbox event is eventually published if publisher keeps running.
A retryable command eventually either succeeds or returns a stable failure.

Safety biasanya lebih mudah dan lebih sering dimulai. Liveness sering lebih sulit karena membutuhkan fairness assumption.


4. Formal Methods Tidak Harus Dimulai dari Theorem Proving

Ada spektrum.

Untuk Java engineer, urutan praktis biasanya:

  1. tulis invariant secara eksplisit;
  2. validasi invariant dengan unit/property tests;
  3. buat model kecil untuk lifecycle atau concurrency protocol;
  4. jalankan model checker untuk mencari counterexample;
  5. ubah counterexample menjadi regression test;
  6. pasang runtime checks/telemetry untuk invariant yang penting;
  7. ulangi saat desain berubah.

Theorem proving berguna, tetapi bukan entry point paling produktif untuk kebanyakan team.


5. Kapan Formal Methods Layak Dipakai

Formal methods tidak perlu untuk semua hal.

Gunakan ketika ada kombinasi berikut:

stateful behavior
concurrency
retries
idempotency
distributed consistency
lifecycle rules
cross-entity constraints
financial/regulatory impact
expensive incident cost
hard-to-reproduce bugs
ambiguous requirements

Contoh area Java yang cocok:

AreaKenapa Cocok
Workflow lifecycleBanyak state dan transition invalid
Payment/order processingDouble charge, lost update, idempotency risk
Outbox/inbox messagingOrdering, duplicate, crash windows
Distributed lock/leaseExpiry, split brain, stale owner
Approval/escalation rulesGuard priority dan conflict rule
Cache invalidationStale read dan write/read reordering
Authorization policyConstraint kombinatorial
Scheduler/retry systemTime, retry, failure, backoff interaction
Case managementState, actor, SLA, audit, assignment, terminality

Jangan gunakan formal methods untuk:

straightforward CRUD mapping
simple DTO validation
UI layout logic
low-impact glue code
behavior yang sudah jelas dan murah diuji biasa

Formal methods harus dipakai seperti microscope: untuk bagian kecil tetapi kritikal.


6. Invariant-First Engineering

Top engineer tidak memulai dari class diagram. Mereka memulai dari invariant.

Contoh requirement lemah:

Case should be assigned correctly.

Requirement ini tidak cukup.

Ubah menjadi invariant:

A case in ASSIGNED state has exactly one active assignee.
A case in NEW state has no active assignee.
A RESOLVED case has no active escalation timer.
A CLOSED case cannot receive any domain-changing command.
Every emitted CaseAssigned event corresponds to exactly one committed assignment transition.

Invariant yang baik punya ciri:

observable
falsifiable
state-based
small enough to test
important enough to protect
independent from implementation detail where possible

Contoh Java assertion-level representation:

public final class CaseInvariants {
    public static void assertValid(CaseAggregate c) {
        if (c.status() == CaseStatus.NEW && c.assignee().isPresent()) {
            throw new InvariantViolation("NEW case must not have assignee");
        }

        if (c.status() == CaseStatus.ASSIGNED && c.assignee().isEmpty()) {
            throw new InvariantViolation("ASSIGNED case must have assignee");
        }

        if (c.status().isTerminal() && c.hasActiveEscalationTimer()) {
            throw new InvariantViolation("terminal case must not have active escalation timer");
        }
    }
}

Ini belum formal verification. Tetapi ini langkah formalization.

Setelah invariant eksplisit, ia bisa dipakai oleh:

unit tests
property-based tests
model-based tests
runtime checks
migration validation
production diagnostics

7. Safety vs Liveness dengan Contoh Java

7.1 Safety

Safety berarti sesuatu yang buruk tidak pernah terjadi.

Contoh:

No duplicate terminal event is emitted for the same case.

Dalam Java, bug safety bisa muncul seperti ini:

@Transactional
public void resolve(ResolveCaseCommand command) {
    CaseEntity entity = repository.findById(command.caseId());

    if (entity.status() == RESOLVED) {
        return;
    }

    entity.resolve(command.resolution());
    repository.save(entity);
    outbox.insert(CaseResolvedEvent.from(entity));
}

Kelihatannya idempotent. Tetapi dua transaksi concurrent bisa sama-sama membaca status belum RESOLVED bila locking/versioning salah. Hasilnya bisa dua outbox event.

Safety property:

For each caseId, at most one CaseResolved event exists.

7.2 Liveness

Liveness berarti sesuatu yang baik akhirnya terjadi.

Contoh:

If an outbox event exists and the publisher keeps running, the event is eventually published.

Tetapi liveness selalu bergantung pada assumption:

publisher thread eventually gets CPU
message broker eventually accepts publish
network eventually recovers
database remains available

Tanpa assumption, liveness tidak realistis.

Formal modeling memaksa kita menyebut assumption ini, bukan menyembunyikannya di kata “eventually”.


8. Refinement: Dari Model Abstrak ke Implementasi Java

Model formal biasanya abstrak. Java implementation detail biasanya konkret.

Refinement adalah klaim bahwa implementasi konkret tidak menunjukkan behavior yang melanggar model abstrak.

Contoh model abstrak:

ResolveCase atomically changes status to RESOLVED and appends CaseResolved event.

Implementasi konkret:

HTTP request
parse JSON
authorize actor
load aggregate
validate transition
update PostgreSQL row
insert outbox row
commit transaction
publisher later sends Kafka event
consumer eventually handles event

Refinement mapping menghubungkan keduanya:

Abstract ModelJava Implementation
caseStatus[id]case_table.status
outbox[id]outbox_event rows
processedCommandsidempotency_key table
ResolveCase actionDB transaction in command handler
Publish actionoutbox poller transaction
terminal invariantdomain + DB constraint + tests + telemetry

Tujuan bukan membuktikan seluruh implementation bit-by-bit. Tujuan praktisnya:

setiap konsep correctness penting punya mapping implementasi yang jelas.

Kalau tidak ada mapping, model hanya dokumentasi.


9. Counterexample adalah Hadiah

Formal methods sering dianggap gagal ketika model checker menemukan violation. Sebaliknya, itu sukses.

Counterexample adalah execution trace yang menunjukkan bug desain.

Contoh trace:

1. Case C is SUBMITTED
2. Worker A reads C
3. Worker B reads C
4. Worker A resolves C and emits event E1
5. Worker B resolves C and emits event E2
6. invariant AtMostOneTerminalEvent violated

Trace ini jauh lebih valuable daripada error message biasa.

Workflow yang benar:

Counterexample harus menjadi artifact engineering. Simpan sebagai:

issue link
spec file
trace explanation
regression test
design decision record

10. Tooling Map untuk Java Engineer

Tool/ApproachCocok UntukOutput
Structured invariantsSemua domain pentingInvariant catalog
JML/OpenJMLMethod/class contract JavaContract checks, static/runtime verification
TLA+/TLCConcurrent/distributed/stateful protocolsCounterexample trace, checked invariants/liveness
PlusCalAlgorithmic process-style modelsTLA+ spec generated from pseudo-code-like syntax
AlloyRelational/domain constraintsCounterexample instance, model visualization
jqwikExecutable generated testsFailing seed, shrunk input
Model-based testingModel-to-test bridgeGenerated command sequences
jcstressJava memory/concurrency edge behaviorObserved outcomes across interleavings
Runtime assertionsProduction guardrailEarly violation signal
JFR/custom eventsProduction evidenceTraceable runtime behavior

Pilih tool berdasarkan shape masalah, bukan popularitas.

Relational constraint? Alloy.
Concurrent/distributed transition? TLA+.
Java method-level contract? JML/OpenJML.
Runtime concurrency primitive? jcstress.
Input/domain generator? jqwik.

11. Formal Methods Workflow yang Realistis

Gunakan workflow ini untuk feature kritikal.

Step 1 — Select Critical Behavior

Jangan mulai dari seluruh service. Pilih satu behavior.

Contoh:

idempotent resolve command
outbox publish semantics
case escalation timer
approval matrix
assignment transfer

Step 2 — Define Boundary

Tentukan apa yang masuk model.

In scope:
- command status
- case status
- processed command ids
- outbox events
- retry behavior

Out of scope:
- HTTP serialization
- database index structure
- JSON field names
- UI behavior

Boundary buruk membuat model tidak berguna.

Step 3 — List State Variables

Gunakan variabel abstrak.

caseStatus
caseVersion
commands
outbox
locks
published

Jangan langsung menyalin schema database. Model bukan ERD.

Step 4 — List Actions

Action harus mewakili semua cara state berubah.

ReceiveCommand
ProcessCommand
CommitState
AppendOutbox
PublishEvent
RetryCommand
Crash
Recover

Jika crash/retry tidak dimodelkan, model tidak bisa menemukan bug crash/retry.

Step 5 — Write Safety Properties

Mulai dari invariant paling mahal.

NoDuplicateResolution
NoEventWithoutStateChange
TerminalStateImmutable
VersionMonotonic
ProcessedCommandOutcomeStable

Step 6 — Add Liveness If Needed

Tambahkan liveness hanya jika progress penting.

Every queued outbox event is eventually published under fair publisher scheduling.

Step 7 — Run Model Checker or Analyzer

Gunakan scope kecil.

2 cases
2 commands
2 workers
2 retries

Scope kecil sering cukup untuk menemukan bug desain besar.

Step 8 — Translate Counterexample to Tests

Setiap counterexample penting harus menjadi regression test.

@Test
void concurrentResolveMustNotEmitDuplicateTerminalEvents() {
    // Reproduce the trace found by the model.
}

Step 9 — Keep Model Alive

Model mati kalau tidak ikut berubah saat desain berubah.

Masukkan ke review checklist:

Does this change affect a modeled invariant?
Does the model need a new action?
Does a new failure mode appear?
Do existing counterexample tests still represent the risk?

12. Pattern: Modeling Idempotency

Idempotency adalah area paling sering disalahpahami.

Statement lemah:

Request is idempotent.

Statement kuat:

For the same idempotency key and semantically same request body, repeated execution returns the same stable outcome and does not create additional domain side effects.

State yang perlu dimodelkan:

processedKeys
keyToOutcome
domainState
sideEffects
inFlightCommands

Safety properties:

same key never maps to two outcomes
same key never creates two domain events
same key with different body is rejected
in-flight failure does not create ambiguous visible state

Java implementation mapping:

idempotency_key table
unique index on key
request hash
stored response or outcome reference
transaction around domain mutation + idempotency record
retry-safe status transitions

Test mapping:

unit: duplicate command returns same result
integration: unique index prevents race
property: generated duplicate/reordered commands keep invariant
model: crash/retry interleaving cannot duplicate side effect
load test: retry storm does not violate invariant

13. Pattern: Modeling Outbox Correctness

Outbox pattern sering dianggap selesai hanya dengan tabel outbox. Tidak cukup.

Invariant yang perlu dijaga:

No published event without committed state.
No committed transition that requires event is permanently invisible.
No event is interpreted twice by non-idempotent consumers.
Event payload corresponds to committed version.

State model:

dbState
outboxRows
publishedEvents
consumerState

Actions:

CommitBusinessTransaction
PollOutbox
PublishToBroker
MarkPublished
ConsumerHandle
ConsumerRetry

Failure windows:

crash before commit
crash after commit before publish
crash after publish before mark published
broker accepts but ack lost
consumer processes but offset commit fails

Formal thinking reveals a core truth:

Outbox gives atomicity between DB state and event record, not exactly-once end-to-end processing.

Therefore Java design still needs:

idempotent publisher
idempotent consumer
unique event id
consumer deduplication or natural idempotency
observable publish lag
reconciliation job

14. Pattern: Modeling Workflow Terminality

Terminal state bugs are common.

Example invariant:

Once CLOSED, a case cannot transition to any non-terminal state.

But real systems have admin correction. So invariant may need refinement:

Only a correction action with explicit audit reason can reopen a closed case.
Normal workflow commands cannot reopen terminal cases.

This distinction matters.

Bad invariant:

Closed cases never change.

Better invariant:

Closed cases never change through normal workflow actions.
Correction actions create a new audit-linked revision and preserve terminal history.

Java mapping:

command type hierarchy
transition guard
permission check
audit table
case revision number
history event

Formal model should include correction action only if it exists. Otherwise the implementation will eventually “exception-case” its way around the model.


15. Pattern: Modeling Authorization and Separation of Duties

Authorization bugs are relational. They often fit Alloy better than TLA+.

Example invariant:

The officer who recommends enforcement action cannot be the same officer who approves it.

Related constraints:

case has assigned investigator
case has reviewer
reviewer belongs to permitted department
approver has required role
same user cannot occupy conflicting roles
emergency override requires audit reason

This can be represented as relational constraints.

Java test mapping:

role matrix tests
policy engine tests
generated user/role/case graph tests
negative authorization tests
audit trail tests

Formal point:

If the problem is about relationships and constraints, do not force it into example tests only.

16. Formal Specification as Design Review Artifact

A formal spec is not just a file. It is a design review tool.

Good design review questions:

What are the state variables?
What actions can change them?
What states are invalid?
What failure windows exist?
What happens on retry?
What happens on duplicate message?
What happens if two actors act concurrently?
What progress assumptions are needed?
Which properties are safety and which are liveness?
What implementation elements enforce each invariant?
What telemetry proves the invariant is holding in production?

This moves review away from taste:

I prefer this design.

Toward evidence:

This design permits duplicate terminal events under this interleaving.

17. Common Anti-Patterns

17.1 Modeling Code Instead of Behavior

Bad:

Model every class, repository, DTO, and mapper.

Good:

Model the state transitions and failure windows that determine correctness.

17.2 Hiding the Environment

If retry, duplicate delivery, timeout, crash, and stale read are possible in production, they must appear in the model.

A model that assumes perfect execution proves little.

17.3 Only Checking Happy Path Invariants

Bad invariant:

Resolved case has resolved status.

This is almost tautological.

Better:

A case has at most one terminal resolution event across all retries and concurrent attempts.

17.4 Treating Model Checker Pass as Proof of Production Correctness

Model checker pass means:

No violation found within this model and configuration.

It does not mean:

The production system is correct.

The model may omit a failure mode. The scope may be too small. The invariant may be wrong. The Java implementation may not refine the model.

17.5 No Counterexample-to-Test Workflow

If a counterexample does not become a regression test, the same bug can return.

Formal methods should feed the normal engineering pipeline.


18. Example: From Requirement to Formal Artifact

Requirement:

A case can be escalated if it remains assigned but unresolved past the SLA threshold.

Hidden ambiguity:

What if case is resolved exactly when escalation job runs?
What if assignment changes?
What if SLA clock is paused?
What if escalation command is retried?
What if two scheduler instances run?
What if case is closed by correction?

Formalization:

State:

status
assignee
assignedAt
slaPaused
escalated
version
processedEscalationKeys

Actions:

Assign
PauseSla
ResumeSla
Resolve
RunEscalationJob
RetryEscalation
AdminClose

Safety:

Resolved or closed case is not escalated by normal job.
Case is escalated at most once per assignment epoch.
Escalation event references the assignment epoch it escalated.
Duplicate escalation command does not create duplicate event.

Liveness:

If a case remains assigned, unresolved, unpaused, and scheduler keeps running after SLA threshold, then escalation eventually occurs.

Java design implications:

assignmentEpoch column
unique index: case_id + assignment_epoch + event_type
transactional check of current status and epoch
scheduler idempotency key
outbox event includes assignmentEpoch
consumer deduplication by eventId

This is formal methods doing real work. It changes schema, command design, and event payload.


19. Integration with Java Test Strategy

Formal methods should not live alone.

Mapping example:

Formal ArtifactJava Artifact
Invariantassertion/helper/test oracle
Actioncommand handler test step
Counterexample traceregression test scenario
State variableaggregate field/table/message state
Environment actionfake dependency/failure injector
Liveness assumptionscheduler/load/integration test condition
Refinement mappingarchitecture decision record

20. What to Put in the Repository

A mature codebase can include:

/specs
  /tla
    CaseEscalation.tla
    CaseEscalation.cfg
    README.md
  /alloy
    SeparationOfDuties.als
  /jml
    Money.jml-notes.md
/src/test/java
  ... counterexample regression tests
/docs/adr
  ADR-014-idempotent-case-resolution.md

README for each spec should explain:

problem modeled
state variables
actions
properties checked
known abstractions
known omissions
how to run
last meaningful design change
linked tests

Without this, specs become archaeology.


21. Review Checklist

Use this checklist before modeling:

Is the behavior stateful?
Can two actors operate concurrently?
Can commands/messages be retried?
Can external failures happen between important steps?
Is there an invariant with high business/regulatory/financial value?
Can we define a small finite model?
Can counterexamples be translated to tests?
Is the team willing to maintain the spec?

Use this checklist after modeling:

Did we model the dangerous failure windows?
Did we check non-trivial safety properties?
Did we document liveness assumptions?
Did we avoid modeling irrelevant implementation detail?
Did we map abstract variables to Java/database/event artifacts?
Did we create regression tests for discovered bugs?
Did we add telemetry for production invariant evidence?

22. Mental Model Final

Formal methods are not about proving that you are smart. They are about making system behavior explicit enough that hidden design bugs have fewer places to hide.

A strong Java engineer uses formal methods like this:

not everywhere
not ceremonially
not as a replacement for tests
not as academic theater

but surgically,
for critical stateful behavior,
with executable specs,
linked to tests,
linked to implementation,
linked to production evidence.

The practical formula:

invariant + state + action + failure mode + counterexample + regression test + runtime evidence

That is the bridge from formal reasoning to production engineering.


23. What Comes Next

Part 020 will go deeper into TLA+.

We will model Java system behavior with:

state variables
actions
Init
Next
Spec
invariants
liveness
TLC model checking
counterexample interpretation
Java implementation mapping

The focus will not be syntax for its own sake. The focus will be using TLA+ to design Java systems that involve:

workflow state
idempotent commands
retry
outbox
concurrency
optimistic locking
failure windows
Lesson Recap

You just completed lesson 19 in build core. Use the series map if you want to review the broader track, or continue directly into the next lesson while the context is still warm.

Continue The Track

Keep the momentum while the lesson is still fresh. Move backward for review or continue forward into the next concept.