Bitemporal and Correction Pipelines

Learn Java Data Pipeline Pattern - Part 056

Bitemporal and correction pipeline patterns for effective time, recorded time, auditability, restatement, reproducible history, and regulatory defensibility.

[2026-07-04] 15 min read, 2985 words


Part 056 — Bitemporal and Correction Pipelines

Many pipeline bugs are not caused by missing data.
They come from confusing when something happened, when we learned about it, and when it became effective.

A normal table asks:

What is the value now?

A serious regulatory or audit-grade pipeline must answer:

What did we believe on date X about facts effective on date Y?

That is a bitemporal question.

This part covers bitemporal and correction pipeline patterns: how to model effective time, recorded time, correction events, restatement, audit views, and reproducible historical truth in Java data systems.

1. The Problem: Time Is Not One Thing

Consider a regulatory enforcement case.

Case C-100 was closed effective 2026-03-31.
The closure was entered into the case system on 2026-04-02.
The closure was corrected on 2026-04-10 because the effective date should have been 2026-03-30.
A report generated on 2026-04-05 used the old effective date.
A report generated on 2026-04-12 should use the corrected date.

There are at least three times:

Time	Meaning
Event time	When source says the event occurred
Effective/valid time	When the fact is true in the business domain
Recorded/transaction time	When the system learned or recorded the fact

Sometimes there is also:

Time	Meaning
Processing time	When pipeline processed it
Source commit time	When DB transaction committed
Publication time	When output became visible to consumers
Report run time	When a consumer generated a derived artifact

If you collapse all of these into one created_at, you lose the ability to explain history.

2. Bitemporal Model in One Sentence

A bitemporal model stores facts across two axes:

valid time      = when the fact is true in the real/business world
transaction time = when the system recorded or believed the fact

Alternative names:

Concept	Also called
Valid time	effective time, business time, actual time
Transaction time	recorded time, system time, assertion time, knowledge time

Use names that fit your domain, but keep the axes separate.

3. Why Bitemporal Matters in Pipelines

Pipeline outputs are often consumed later by:

reports
dashboards
audits
ML features
regulatory submissions
operational decisions
incident investigations
legal discovery

These consumers may need different truth modes.

Consumer question	Required model
What is the current accepted state?	Latest valid fact by key
What was true on March 31?	Valid-time query
What did the system believe on April 5?	Transaction-time query
What did we report on April 5 about March?	Bitemporal query + report run lineage
Why did numbers change?	Correction/restatement lineage
Who changed the fact and why?	Audit metadata

Without bitemporal data, teams often rewrite history silently.

That is dangerous in enforcement, finance, healthcare, legal, public sector, and compliance-heavy systems.

4. A Simple Example

Initial fact:

recorded_at: 2026-04-02T10:00:00Z
valid_from:  2026-03-31T00:00:00Z
valid_to:    infinity
case_id:     C-100
status:      CLOSED

Correction:

recorded_at: 2026-04-10T09:00:00Z
valid_from:  2026-03-30T00:00:00Z
valid_to:    infinity
case_id:     C-100
status:      CLOSED
correction_of: previous fact
reason:      EFFECTIVE_DATE_CORRECTION

A query as of April 5 should see effective date March 31.

A query as of April 12 should see effective date March 30.

A query asking “what is currently accepted for March 30?” should use the corrected fact.
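The three queries above can be checked against an in-memory version of the two C-100 rows. This is a minimal sketch, not a storage design: the `Fact` record, the `a-1`/`a-2` identifiers, and the `asKnownAt` helper are names invented for illustration, and it assumes the old row's recorded interval was closed at the moment the correction was recorded.

```java
import java.time.Instant;
import java.util.List;
import java.util.Optional;

final class SimpleExampleSketch {
    // Sentinel for an open-ended interval (an assumed convention for this sketch).
    static final Instant INFINITY = Instant.parse("9999-12-31T00:00:00Z");

    // One ledger row with half-open valid and recorded intervals.
    record Fact(String id, String status,
                Instant validFrom, Instant validTo,
                Instant recordedFrom, Instant recordedTo) {
        boolean visible(Instant validAt, Instant knownAt) {
            return !validAt.isBefore(validFrom) && validAt.isBefore(validTo)
                && !knownAt.isBefore(recordedFrom) && knownAt.isBefore(recordedTo);
        }
    }

    // The two C-100 rows as they look after the correction has been recorded.
    static List<Fact> ledger() {
        return List.of(
            new Fact("a-1", "CLOSED",
                Instant.parse("2026-03-31T00:00:00Z"), INFINITY,
                Instant.parse("2026-04-02T10:00:00Z"), Instant.parse("2026-04-10T09:00:00Z")),
            new Fact("a-2", "CLOSED",
                Instant.parse("2026-03-30T00:00:00Z"), INFINITY,
                Instant.parse("2026-04-10T09:00:00Z"), INFINITY));
    }

    // Returns the row that was valid at validAt according to what was known at knownAt.
    static Optional<Fact> asKnownAt(List<Fact> ledger, Instant validAt, Instant knownAt) {
        return ledger.stream().filter(f -> f.visible(validAt, knownAt)).findFirst();
    }

    public static void main(String[] args) {
        Instant march30Noon = Instant.parse("2026-03-30T12:00:00Z");
        // As known on April 5, the case was not yet closed on March 30.
        System.out.println(asKnownAt(ledger(), march30Noon,
            Instant.parse("2026-04-05T00:00:00Z")).isPresent()); // false
        // As known on April 12, the corrected row covers March 30.
        System.out.println(asKnownAt(ledger(), march30Noon,
            Instant.parse("2026-04-12T00:00:00Z")).map(Fact::id)); // Optional[a-2]
    }
}
```

Nothing is overwritten: both rows stay in the list, and the recorded interval alone decides which one a given query sees.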

5. Bitemporal Dimensions

A robust event/fact model often stores:

business_key
valid_from
valid_to
recorded_from
recorded_to
payload
assertion_id
supersedes_assertion_id
reason
source_lineage

The intervals are usually half-open:

[from, to)

Meaning:

start inclusive, end exclusive

This avoids ambiguous boundary overlap.

Example:

valid_from <= query_valid_time < valid_to
recorded_from <= query_recorded_time < recorded_to
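To see why the half-open convention removes boundary ambiguity, consider two adjacent intervals that share an endpoint. The sketch below uses a hypothetical `Range` record and illustrative month boundaries; the shared instant belongs to exactly one of the two ranges.

```java
import java.time.Instant;

final class HalfOpenSketch {
    // Half-open range: start inclusive, end exclusive.
    record Range(Instant fromInclusive, Instant toExclusive) {
        boolean contains(Instant t) {
            return !t.isBefore(fromInclusive) && t.isBefore(toExclusive);
        }
    }

    public static void main(String[] args) {
        Instant boundary = Instant.parse("2026-04-01T00:00:00Z");
        Range march = new Range(Instant.parse("2026-03-01T00:00:00Z"), boundary);
        Range april = new Range(boundary, Instant.parse("2026-05-01T00:00:00Z"));

        // The boundary instant is excluded from the earlier range...
        System.out.println(march.contains(boundary)); // false
        // ...and included in the later one, so it is never counted twice.
        System.out.println(april.contains(boundary)); // true
    }
}
```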

6. Event Time vs Valid Time

Do not assume event time and valid time are identical.

Example:

Event emitted: 2026-04-02
Decision effective date: 2026-03-31

The event happened in the system on April 2. The business fact applies from March 31.

For case lifecycle modelling, this distinction is critical:

assignment entered late
escalation effective retroactively
sanction decision backdated by legal rule
appeal suspends SLA from prior date
correction changes effective boundary

A stream processor may use event time for watermark/windowing, but the business state machine may use valid time for domain truth.

7. Transaction Time vs Processing Time

Transaction/recorded time should represent when the source system committed or recorded the assertion.

Processing time represents when the pipeline happened to process it.

These are different.

source recorded_at: 2026-04-02 10:00
pipeline processed_at: 2026-04-03 01:00

If the pipeline was delayed, processing time should not rewrite source history.

Use processing time for pipeline observability and lineage, not business truth.
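One way to keep the two apart is to carry both on the record envelope and derive lag from them, without ever writing processing time into the recorded-time column. The `Envelope` record and its field names below are assumptions made for this sketch.

```java
import java.time.Duration;
import java.time.Instant;

final class ProcessingLagSketch {
    // recordedAt comes from the source system; processedAt is stamped by the pipeline.
    record Envelope(String eventId, Instant recordedAt, Instant processedAt) {
        // Observability metric only: it never feeds valid or recorded time.
        Duration pipelineLag() {
            return Duration.between(recordedAt, processedAt);
        }
    }

    public static void main(String[] args) {
        Envelope e = new Envelope("evt-status-777",
            Instant.parse("2026-04-02T10:00:00Z"),
            Instant.parse("2026-04-03T01:00:00Z"));
        System.out.println(e.pipelineLag().toHours()); // 15
    }
}
```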

8. The Correction Principle

The safest correction model is:

Never mutate a past assertion silently.
Add a new assertion that supersedes or corrects it.

This creates an audit trail.

Bad:

update case_status
set effective_date = '2026-03-30'
where case_id = 'C-100';

Better:

insert into case_status_assertion (..., valid_from, recorded_from, supersedes_assertion_id, reason)
values (..., '2026-03-30', now(), 'old-assertion-id', 'EFFECTIVE_DATE_CORRECTION');

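Then close the old assertion in transaction time, in the same transaction and with the same timestamp as the insert, so the two recorded intervals meet without a gap or overlap: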

update case_status_assertion
set recorded_to = now()
where assertion_id = 'old-assertion-id';

The old fact remains queryable for “what did we believe before correction?”

9. Bitemporal Table Design

A canonical bitemporal table:

create table case_status_bitemporal (
    assertion_id varchar primary key,
    case_id varchar not null,
    status varchar not null,

    valid_from timestamp not null,
    valid_to timestamp not null,

    recorded_from timestamp not null,
    recorded_to timestamp not null,

    source_event_id varchar not null,
    source_system varchar not null,
    source_commit_time timestamp null,

    supersedes_assertion_id varchar null,
    correction_reason varchar null,
    produced_by_run_id varchar not null,
    transform_version varchar not null,

    payload_hash varchar not null,
    created_at timestamp not null
);

Use an infinity convention carefully.

Examples:

9999-12-31T00:00:00Z

or database-native infinity if supported and portable enough for your stack.
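On the Java side, the sentinel can live in one place so that "open-ended" is never spelled as a literal. The class below is a hypothetical sketch; it deliberately uses a fixed far-future instant rather than `Instant.MAX`, on the assumption that the value must round-trip through a database timestamp column, whose supported range is typically far smaller than that of `Instant`.

```java
import java.time.Instant;

final class TimeSentinels {
    // Must match the sentinel stored in the table, whatever value the team picks.
    static final Instant INFINITY = Instant.parse("9999-12-31T00:00:00Z");

    private TimeSentinels() {}

    // True when an interval end means "still open".
    static boolean isOpenEnded(Instant toExclusive) {
        return INFINITY.equals(toExclusive);
    }

    public static void main(String[] args) {
        System.out.println(isOpenEnded(INFINITY)); // true
        System.out.println(isOpenEnded(Instant.parse("2026-04-10T09:00:00Z"))); // false
    }
}
```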

Indexes:

create index idx_case_status_valid
on case_status_bitemporal (case_id, valid_from, valid_to);

create index idx_case_status_recorded
on case_status_bitemporal (case_id, recorded_from, recorded_to);

create index idx_case_status_bitemporal_query
on case_status_bitemporal (case_id, valid_from, valid_to, recorded_from, recorded_to);

For lakehouse tables, partition carefully. Valid date and recorded date can both matter, but over-partitioning creates small files.

10. Bitemporal Query Patterns

10.1 Current accepted state

select *
from case_status_bitemporal
where case_id = 'C-100'
  and valid_from <= current_timestamp
  and current_timestamp < valid_to
  and recorded_to = timestamp '9999-12-31 00:00:00';

This means:

currently valid and currently accepted

10.2 Business truth as of valid time

select *
from case_status_bitemporal
where case_id = 'C-100'
  and valid_from <= timestamp '2026-03-31 12:00:00'
  and timestamp '2026-03-31 12:00:00' < valid_to
  and recorded_to = timestamp '9999-12-31 00:00:00';

This uses current accepted knowledge to ask what was true then.

10.3 What we believed at recorded time

select *
from case_status_bitemporal
where case_id = 'C-100'
  and valid_from <= timestamp '2026-03-31 12:00:00'
  and timestamp '2026-03-31 12:00:00' < valid_to
  and recorded_from <= timestamp '2026-04-05 00:00:00'
  and timestamp '2026-04-05 00:00:00' < recorded_to;

This answers:

What did we believe on April 5 about the status valid on March 31?

That is the core bitemporal query.

11. Event Model for Corrections

Correction events should be explicit.

{
  "eventId": "evt-correction-001",
  "eventType": "CaseStatusCorrected",
  "caseId": "C-100",
  "correctsEventId": "evt-status-777",
  "correctionReason": "EFFECTIVE_DATE_CORRECTION",
  "old": {
    "status": "CLOSED",
    "effectiveFrom": "2026-03-31T00:00:00Z"
  },
  "new": {
    "status": "CLOSED",
    "effectiveFrom": "2026-03-30T00:00:00Z"
  },
  "recordedAt": "2026-04-10T09:00:00Z",
  "sourceCommitTime": "2026-04-10T09:00:02Z",
  "causationId": "cmd-correct-status-001",
  "correlationId": "case-C-100-correction-20260410"
}

Important fields:

Field	Why it matters
`correctsEventId`	Links correction to prior assertion
`correctionReason`	Explains why history changed
`old`	Optional but useful for evidence/diff
`new`	New assertion payload
`recordedAt`	Transaction/knowledge time
`effectiveFrom`	Valid/business time
`causationId`	Who/what caused correction
`correlationId`	Groups related correction workflow

12. Correction Pipeline Architecture

Key idea:

The ledger is the durable truth.
Projections are views derived from the ledger.

Do not make the projection the source of truth.

13. Assertion Ledger vs Projection

Layer	Purpose	Mutation style
Assertion ledger	Preserve every assertion/correction	Append + close recorded interval
Current projection	Fast current-state query	Upsert by business key
As-of view	Historical query	Derived query or materialized table
Reporting aggregate	Consumer-specific product	Restated/versioned

If a correction arrives, update the ledger first. Then rebuild or update projections.
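The "ledger first, projection derived" rule implies the current projection can be a pure function of the ledger. The following is a rough in-memory sketch under stated assumptions: the `Row` record, the `C-200` case, and the `currentByCase` helper are invented for illustration, and it assumes accepted valid intervals for one case do not overlap.

```java
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ProjectionRebuildSketch {
    static final Instant INFINITY = Instant.parse("9999-12-31T00:00:00Z");

    record Row(String caseId, String status,
               Instant validFrom, Instant validTo,
               Instant recordedFrom, Instant recordedTo) {}

    // Sample ledger: C-100 has a superseded row and its correction; C-200 is untouched.
    static List<Row> sampleLedger() {
        return List.of(
            new Row("C-100", "CLOSED", Instant.parse("2026-03-31T00:00:00Z"), INFINITY,
                Instant.parse("2026-04-02T10:00:00Z"), Instant.parse("2026-04-10T09:00:00Z")),
            new Row("C-100", "CLOSED", Instant.parse("2026-03-30T00:00:00Z"), INFINITY,
                Instant.parse("2026-04-10T09:00:00Z"), INFINITY),
            new Row("C-200", "OPEN", Instant.parse("2026-02-01T00:00:00Z"), INFINITY,
                Instant.parse("2026-02-01T08:00:00Z"), INFINITY));
    }

    // Keeps rows that are still accepted (recorded interval open) and valid at validAsOf.
    static Map<String, Row> currentByCase(List<Row> ledger, Instant validAsOf) {
        Map<String, Row> projection = new TreeMap<>();
        for (Row r : ledger) {
            boolean accepted = r.recordedTo().equals(INFINITY);
            boolean valid = !validAsOf.isBefore(r.validFrom()) && validAsOf.isBefore(r.validTo());
            if (accepted && valid) {
                projection.put(r.caseId(), r);
            }
        }
        return projection;
    }

    public static void main(String[] args) {
        Map<String, Row> current = currentByCase(sampleLedger(), Instant.parse("2026-04-15T00:00:00Z"));
        System.out.println(current.get("C-100").validFrom()); // 2026-03-30T00:00:00Z
    }
}
```

Because the projection is recomputed from the ledger, dropping and rebuilding it never loses information.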

14. Java Domain Model

Use value objects for time axes. Do not pass raw Instant everywhere.

public record ValidTimeRange(Instant fromInclusive, Instant toExclusive) {
    public ValidTimeRange {
        if (!fromInclusive.isBefore(toExclusive)) {
            throw new IllegalArgumentException("valid time range must be non-empty");
        }
    }

    public boolean contains(Instant t) {
        return !t.isBefore(fromInclusive) && t.isBefore(toExclusive);
    }
}

public record RecordedTimeRange(Instant fromInclusive, Instant toExclusive) {
    public RecordedTimeRange {
        if (!fromInclusive.isBefore(toExclusive)) {
            throw new IllegalArgumentException("recorded time range must be non-empty");
        }
    }
}

Assertion:

public record CaseStatusAssertion(
        AssertionId assertionId,
        CaseId caseId,
        CaseStatus status,
        ValidTimeRange validTime,
        RecordedTimeRange recordedTime,
        SourceEventId sourceEventId,
        Optional<AssertionId> supersedesAssertionId,
        Optional<CorrectionReason> correctionReason,
        OutputLineage lineage
) {}

Correction command:

public record CorrectionCommand(
        CaseId caseId,
        AssertionId assertionToCorrect,
        CorrectionReason reason,
        ValidTimeRange correctedValidTime,
        CaseStatus correctedStatus,
        Instant recordedAt,
        SourceEventId sourceEventId
) {}

Do not represent correction as a blind update.

15. Bitemporal Write Algorithm

For a correction:

1. find currently recorded assertion being corrected
2. verify correction is authorized and causally valid
3. close old assertion's recorded interval
4. insert corrected assertion with new recorded_from
5. update projection
6. emit correction lineage if needed

Pseudo-code:

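public void applyCorrection(CorrectionCommand command) {
    Instant recordedAt = command.recordedAt();

    // Step 1: load the assertion whose recorded interval is still open.
    CaseStatusAssertion old = repository.getOpenRecordedAssertion(
        command.assertionToCorrect()
    );

    // Step 2 (minimal causal check): same case, and the correction is recorded
    // strictly after the assertion it corrects. Authorization is assumed to
    // have been checked before the command reaches this method.
    if (!old.caseId().equals(command.caseId())
            || !old.recordedTime().fromInclusive().isBefore(recordedAt)) {
        throw new IllegalArgumentException(
            "correction does not target an earlier assertion of this case");
    }

    // withRecordedTo is assumed to return a copy whose recorded interval ends at recordedAt.
    CaseStatusAssertion closedOld = old.withRecordedTo(recordedAt);

    CaseStatusAssertion corrected = new CaseStatusAssertion(
        AssertionId.newDeterministic(command.sourceEventId()),
        command.caseId(),
        command.correctedStatus(),
        command.correctedValidTime(),
        new RecordedTimeRange(recordedAt, TimeConstants.INFINITY),
        command.sourceEventId(),
        Optional.of(old.assertionId()),
        Optional.of(command.reason()),
        currentLineage()
    );

    // Steps 3-5 in one transaction, so no reader sees a closed assertion
    // without its replacement.
    repository.transaction(() -> {
        repository.closeRecordedInterval(closedOld);
        repository.insert(corrected);
        projection.apply(corrected);
    });
}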

Important: closing the old assertion and inserting the new one must be atomic within the target store's transaction boundary.

16. Overlap Rules

Bitemporal data must control interval overlap.

For a given business key, status intervals may be:

mutually exclusive
overlapping with precedence
overlapping by design because multiple statuses can apply

Do not assume.

Example rules:

Domain	Valid-time overlap allowed?
Case lifecycle primary status	Usually no
Case tags	Yes
Assigned officers	Maybe yes if co-assignment allowed
Risk scores	Usually versioned snapshots
SLA pause intervals	Yes, but must merge/normalize

For primary status:

For each case_id and recorded-time view, valid intervals for primary status must not overlap.

Validation SQL concept:

select a.case_id, a.assertion_id, b.assertion_id
from case_status_bitemporal a
join case_status_bitemporal b
  on a.case_id = b.case_id
 and a.assertion_id < b.assertion_id   -- report each overlapping pair once
 and a.recorded_to = timestamp '9999-12-31 00:00:00'
 and b.recorded_to = timestamp '9999-12-31 00:00:00'
 and a.valid_from < b.valid_to
 and b.valid_from < a.valid_to;

This detects overlapping currently accepted valid intervals.
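The same rule can be enforced in Java before a write reaches the table. The sketch below assumes its input is already filtered to the currently accepted intervals of one case; `Interval` and `overlaps` are hypothetical names. It sorts by start and compares neighbours, which reports at least one pair whenever any overlap exists, though not necessarily every overlapping pair.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class OverlapCheckSketch {
    record Interval(String assertionId, Instant from, Instant to) {}

    // Returns "earlier/later" id pairs for neighbouring intervals that overlap.
    static List<String> overlaps(List<Interval> accepted) {
        List<Interval> sorted = new ArrayList<>(accepted);
        sorted.sort(Comparator.comparing(Interval::from));
        List<String> found = new ArrayList<>();
        for (int i = 1; i < sorted.size(); i++) {
            Interval prev = sorted.get(i - 1);
            Interval next = sorted.get(i);
            // Half-open intervals overlap only if the next one starts before the previous ends.
            if (next.from().isBefore(prev.to())) {
                found.add(prev.assertionId() + "/" + next.assertionId());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Interval open = new Interval("a-1",
            Instant.parse("2026-03-01T00:00:00Z"), Instant.parse("2026-03-31T00:00:00Z"));
        Interval closed = new Interval("a-3",
            Instant.parse("2026-03-30T00:00:00Z"), Instant.parse("9999-12-31T00:00:00Z"));
        System.out.println(overlaps(List.of(closed, open))); // [a-1/a-3]
    }
}
```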

17. Correction Types

Not all corrections mean the same thing.

Type	Meaning	Pipeline behavior
Field correction	Payload field wrong	Supersede assertion
Effective-date correction	Valid-time boundary wrong	Recompute downstream affected windows
Retraction	Fact should not exist	Close/retract assertion
Late assertion	Fact was true earlier but recorded late	Add assertion with old valid time, new recorded time
Legal restatement	Accepted historical truth changed	Publish restatement evidence
Source duplicate	Same assertion repeated	Dedupe, no correction
Source compensation	Business action reverses prior fact	New fact, not necessarily correction

Do not model every negative event as a correction.

Example:

Case reopened after closure

This may be a new lifecycle event, not a correction of closure.

Correction changes the claim about what was true.

Compensation records a new fact that reverses or offsets another fact.

18. Retraction Pattern

A retraction says:

The previous assertion should no longer be considered valid truth.

Retraction event:

{
  "eventType": "CaseStatusRetracted",
  "caseId": "C-100",
  "retractsAssertionId": "assertion-777",
  "reason": "SOURCE_ENTRY_ERROR",
  "recordedAt": "2026-04-10T09:00:00Z"
}

Ledger behavior:

close old assertion in recorded time
optionally insert a tombstone assertion or retraction assertion
downstream projections remove or recompute state

For audit, a retraction should still be visible historically.
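A retraction that closes the recorded interval, rather than deleting the row, keeps that historical visibility. Below is a minimal in-memory sketch; the `Row` record, the `closeReason` field, and both helpers are assumptions for illustration, and the tombstone variant is not shown.

```java
import java.time.Instant;
import java.util.List;

final class RetractionSketch {
    static final Instant INFINITY = Instant.parse("9999-12-31T00:00:00Z");

    record Row(String assertionId, Instant recordedFrom, Instant recordedTo, String closeReason) {}

    // Returns a new ledger in which the open row for assertionId is closed at recordedAt.
    static List<Row> retract(List<Row> ledger, String assertionId, Instant recordedAt, String reason) {
        return ledger.stream()
            .map(r -> r.assertionId().equals(assertionId) && r.recordedTo().equals(INFINITY)
                ? new Row(r.assertionId(), r.recordedFrom(), recordedAt, reason)
                : r)
            .toList();
    }

    // Was the assertion part of the system's belief at knownAt?
    static boolean knownAt(List<Row> ledger, String assertionId, Instant knownAt) {
        return ledger.stream().anyMatch(r -> r.assertionId().equals(assertionId)
            && !knownAt.isBefore(r.recordedFrom()) && knownAt.isBefore(r.recordedTo()));
    }

    public static void main(String[] args) {
        List<Row> before = List.of(
            new Row("assertion-777", Instant.parse("2026-04-02T10:00:00Z"), INFINITY, null));
        List<Row> after = retract(before, "assertion-777",
            Instant.parse("2026-04-10T09:00:00Z"), "SOURCE_ENTRY_ERROR");
        System.out.println(knownAt(after, "assertion-777", Instant.parse("2026-04-05T00:00:00Z"))); // true
        System.out.println(knownAt(after, "assertion-777", Instant.parse("2026-04-12T00:00:00Z"))); // false
    }
}
```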

19. Restatement Pattern

A restatement is a published replacement of previously accepted derived output.

Example:

March 2026 enforcement SLA report is restated after correction batch.

Restatement metadata:

{
  "restatementId": "rst-2026-04-sla-001",
  "supersedesReportRunId": "report-2026-04-05-001",
  "reason": "Late effective-date corrections received on 2026-04-10",
  "validPeriod": "2026-03",
  "recordedAsOf": "2026-04-12T00:00:00Z",
  "producedByRunId": "bf-2026-04-12-sla-restatement-001"
}

Restatement should not pretend the old report never existed.

It should say:

This newer output supersedes that older output.

20. Bitemporal Pipeline Flow

For each source event:

parse -> classify -> derive assertion -> check duplicates -> resolve correction -> write ledger -> update projections -> validate

21. Computing Impacted Windows

A correction can affect many downstream windows.

Example:

Effective date changes from April 1 to March 30.

Impacted outputs:

daily status for March 30, March 31, April 1
monthly March aggregate
monthly April aggregate
SLA breach windows
jurisdictional report
feature store snapshot

Impact function:

public interface CorrectionImpactAnalyzer<E> {
    Set<OutputPartition> impactedPartitions(E correction);
}

Example:

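// Assumes java.time.*, java.util.Set and java.util.stream.* are imported,
// and that `zone` is the ZoneId used for report-period boundaries.
public Set<OutputPartition> impactedPartitions(CaseStatusCorrection correction) {
    LocalDate oldDate = correction.oldValidFrom().atZone(zone).toLocalDate();
    LocalDate newDate = correction.newValidFrom().atZone(zone).toLocalDate();
    LocalDate first = oldDate.isBefore(newDate) ? oldDate : newDate;
    LocalDate last = oldDate.isBefore(newDate) ? newDate : oldDate;

    // datesUntil is end-exclusive, so add one day to include `last` itself
    // (and nothing beyond it).
    return first.datesUntil(last.plusDays(1))
            .flatMap(date -> Stream.of(
                    OutputPartition.daily(date),
                    OutputPartition.monthly(YearMonth.from(date))
            ))
            .collect(Collectors.toSet());
}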

Corrections should trigger targeted restatement, not always global recompute.

22. Bitemporal Joins

Joining two historical datasets requires choosing the time semantics.

Example:

Case status joined to jurisdiction calendar.

Possible joins:

Join type	Meaning
Current reference join	Use current accepted calendar
Valid-time join	Use calendar valid at case effective date
Transaction-time join	Use calendar version known at report run time
Bitemporal join	Use calendar valid at business time and known at recorded time

If reports must be reproducible, use bitemporal join.

Pseudo-condition:

case.valid_from >= calendar.valid_from
and case.valid_from < calendar.valid_to
and report.recorded_as_of >= calendar.recorded_from
and report.recorded_as_of < calendar.recorded_to

This prevents accidentally using a future-corrected calendar to explain a past report unless that is the intended restatement mode.
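The pseudo-condition can be written as a lookup that picks the calendar version for a case and a report run. This is a sketch under assumptions: `CalendarVersion`, the `cal-v1`/`cal-v2` identifiers, and the April 8 calendar correction are invented sample data, not part of the lesson's scenario.

```java
import java.time.Instant;
import java.util.List;
import java.util.Optional;

final class CalendarJoinSketch {
    static final Instant INFINITY = Instant.parse("9999-12-31T00:00:00Z");

    record CalendarVersion(String versionId,
                           Instant validFrom, Instant validTo,
                           Instant recordedFrom, Instant recordedTo) {}

    // Hypothetical reference data: cal-v1 was corrected by cal-v2 on April 8.
    static List<CalendarVersion> sampleCalendars() {
        Instant jan1 = Instant.parse("2026-01-01T00:00:00Z");
        Instant apr8 = Instant.parse("2026-04-08T00:00:00Z");
        return List.of(
            new CalendarVersion("cal-v1", jan1, INFINITY, jan1, apr8),
            new CalendarVersion("cal-v2", jan1, INFINITY, apr8, INFINITY));
    }

    // Bitemporal join condition: valid at the case's effective time, known at the report's as-of time.
    static Optional<CalendarVersion> calendarFor(List<CalendarVersion> calendars,
                                                 Instant caseValidFrom,
                                                 Instant reportRecordedAsOf) {
        return calendars.stream()
            .filter(c -> !caseValidFrom.isBefore(c.validFrom()) && caseValidFrom.isBefore(c.validTo()))
            .filter(c -> !reportRecordedAsOf.isBefore(c.recordedFrom())
                && reportRecordedAsOf.isBefore(c.recordedTo()))
            .findFirst();
    }

    public static void main(String[] args) {
        Instant caseValidFrom = Instant.parse("2026-03-30T00:00:00Z");
        // Reproducing the April 5 report must not pick up the April 8 calendar fix.
        System.out.println(calendarFor(sampleCalendars(), caseValidFrom,
            Instant.parse("2026-04-05T00:00:00Z")).map(CalendarVersion::versionId)); // Optional[cal-v1]
    }
}
```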

23. Truth Modes

A mature platform exposes truth modes explicitly.

Truth mode	Meaning
`CURRENT_ACCEPTED`	Latest accepted understanding
`AS_REPORTED`	What was published at report time
`AS_KNOWN_AT`	What system knew at recorded time
`AS_EFFECTIVE_AT`	Facts valid at business time using current knowledge
`REVISED_TRUTH`	Restated/corrected truth after accepted corrections
`SOURCE_OBSERVED`	Raw source assertion, no correction collapse

Java enum:

public enum TruthMode {
    CURRENT_ACCEPTED,
    AS_REPORTED,
    AS_KNOWN_AT,
    AS_EFFECTIVE_AT,
    REVISED_TRUTH,
    SOURCE_OBSERVED
}

Do not let consumers query “history” without specifying truth mode.

24. Current Projection from Bitemporal Ledger

A current projection is a convenience view.

create view current_case_status as
select *
from case_status_bitemporal s
where s.recorded_to = timestamp '9999-12-31 00:00:00'
  and s.valid_from <= current_timestamp
  and current_timestamp < s.valid_to;

But be careful with current_timestamp in materialized outputs. It makes results time-dependent.

For reproducible reports, parameterize time:

where s.valid_from <= :valid_as_of
  and :valid_as_of < s.valid_to
  and s.recorded_from <= :recorded_as_of
  and :recorded_as_of < s.recorded_to

25. Bitemporal in Lakehouse Tables

Lakehouse formats with snapshots help with transaction-time publication, but they do not automatically solve valid-time modeling.

You still need columns such as:

valid_from
valid_to
recorded_from
recorded_to
assertion_id
supersedes_assertion_id

Table snapshots answer:

What files/rows were in the table at snapshot N?

Bitemporal columns answer:

What business facts were valid at time Y and known at time X?

These are complementary.

A lakehouse snapshot may represent publication time. A bitemporal ledger represents domain/system knowledge time.

26. Kafka Topics for Corrections

Topic design options:

26.1 Same canonical event topic

case-events-v1

Contains both facts and corrections.

Pros:

preserves order by key
consumers see all state-changing facts

Cons:

consumers must understand correction semantics

26.2 Dedicated correction topic

case-events-v1
case-corrections-v1

Pros:

clear operational visibility

Cons:

ordering across topics is harder
consumers must join streams

26.3 Assertion ledger topic

case-status-assertions-v1

Contains normalized bitemporal assertions.

This is often cleaner for downstream analytics.

Key rule:

Partition by business key when ordering corrections relative to original assertions matters.

27. Ordering and Late Corrections

A correction can arrive before the event it corrects in downstream processing due to replay, topic ordering, or source disorder.

Options:

Policy	Behavior
Hold pending correction	Store until original arrives
Resolve by assertion ID	If old assertion missing, query ledger
Emit unresolved correction	Route to quarantine/pending lane
Apply as independent assertion	Dangerous unless semantics allow

Pending correction table:

create table pending_correction (
    correction_event_id varchar primary key,
    target_assertion_id varchar not null,
    case_id varchar not null,
    payload jsonb not null,
    first_seen_at timestamp not null,
    retry_after timestamp not null,
    status varchar not null
);

Do not drop corrections because the original event has not arrived yet.
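The "hold pending correction" policy can be sketched in memory as follows. The class and method names are hypothetical, and a production version would persist the parked corrections (for example in the `pending_correction` table above) and add retry and expiry handling, which this sketch omits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PendingCorrectionSketch {
    record Correction(String correctionEventId, String targetAssertionId) {}

    private final Set<String> knownAssertions = new HashSet<>();
    private final Map<String, List<Correction>> pendingByTarget = new HashMap<>();
    private final List<String> applied = new ArrayList<>();

    // An original assertion arrived: register it and release anything parked for it.
    void onAssertion(String assertionId) {
        knownAssertions.add(assertionId);
        List<Correction> parked = pendingByTarget.remove(assertionId);
        if (parked != null) {
            parked.forEach(c -> applied.add(c.correctionEventId()));
        }
    }

    // A correction arrived: apply it if its target is known, otherwise park it.
    void onCorrection(Correction c) {
        if (knownAssertions.contains(c.targetAssertionId())) {
            applied.add(c.correctionEventId());
        } else {
            pendingByTarget.computeIfAbsent(c.targetAssertionId(), k -> new ArrayList<>()).add(c);
        }
    }

    List<String> applied() { return List.copyOf(applied); }

    int pendingCount() {
        return pendingByTarget.values().stream().mapToInt(List::size).sum();
    }

    public static void main(String[] args) {
        PendingCorrectionSketch resolver = new PendingCorrectionSketch();
        resolver.onCorrection(new Correction("evt-correction-001", "assertion-777"));
        System.out.println(resolver.pendingCount()); // 1
        resolver.onAssertion("assertion-777");
        System.out.println(resolver.applied()); // [evt-correction-001]
    }
}
```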

28. Dedupe for Corrections

Correction events require idempotency.

Dedupe keys:

correction event ID
source command ID
corrected assertion ID + correction sequence
payload hash + source commit time

Avoid deduping only by business key.

Two corrections for the same case may both be valid.

C-100 effective date corrected
C-100 status reason corrected

Same case, different correction.
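A small sketch of idempotency keyed on the correction event ID rather than the business key. The names are hypothetical, and the in-memory set stands in for whatever durable dedupe store the pipeline uses.

```java
import java.util.HashSet;
import java.util.Set;

final class CorrectionDedupeSketch {
    record Correction(String correctionEventId, String caseId, String kind) {}

    private final Set<String> seenEventIds = new HashSet<>();

    // True the first time a correction event ID is seen, false on redelivery.
    boolean shouldApply(Correction c) {
        return seenEventIds.add(c.correctionEventId());
    }

    public static void main(String[] args) {
        CorrectionDedupeSketch dedupe = new CorrectionDedupeSketch();
        Correction date = new Correction("evt-correction-001", "C-100", "EFFECTIVE_DATE");
        Correction reason = new Correction("evt-correction-002", "C-100", "STATUS_REASON");
        System.out.println(dedupe.shouldApply(date));   // true
        System.out.println(dedupe.shouldApply(reason)); // true: same case, different correction
        System.out.println(dedupe.shouldApply(date));   // false: redelivered event
    }
}
```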

29. Correction and Aggregates

Aggregates are where corrections become painful.

Suppose a case was counted in April, but a correction moves it to March.

The aggregate update is not simply:

March +1

It may be:

April -1
March +1

If the original contribution is known, use a contribution ledger.

create table aggregate_contribution_ledger (
    contribution_id varchar primary key,
    aggregate_name varchar not null,
    aggregate_key varchar not null,
    source_assertion_id varchar not null,
    contribution_value decimal not null,
    recorded_from timestamp not null,
    recorded_to timestamp not null,
    produced_by_run_id varchar not null
);

A correction then means superseding the recorded contribution, not guessing the delta.
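With the old contribution on record, the aggregate adjustment falls out mechanically. The sketch below uses a hypothetical `Contribution` record keyed by report month and computes the net change per aggregate key when one contribution supersedes another.

```java
import java.math.BigDecimal;
import java.util.Map;
import java.util.TreeMap;

final class ContributionDeltaSketch {
    record Contribution(String aggregateKey, BigDecimal value) {}

    // Net change per aggregate key when `old` is superseded by `replacement`.
    static Map<String, BigDecimal> deltas(Contribution old, Contribution replacement) {
        Map<String, BigDecimal> delta = new TreeMap<>();
        delta.merge(old.aggregateKey(), old.value().negate(), BigDecimal::add);
        delta.merge(replacement.aggregateKey(), replacement.value(), BigDecimal::add);
        // A correction that leaves the key and value unchanged nets out to nothing.
        delta.values().removeIf(v -> v.signum() == 0);
        return delta;
    }

    public static void main(String[] args) {
        Contribution countedInApril = new Contribution("2026-04", BigDecimal.ONE);
        Contribution belongsInMarch = new Contribution("2026-03", BigDecimal.ONE);
        System.out.println(deltas(countedInApril, belongsInMarch)); // {2026-03=1, 2026-04=-1}
    }
}
```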

30. Correction and Materialized Views

A materialized view should be rebuildable from the ledger.

Design options:

Option	Use when
Incremental correction update	Low latency needed, correction logic simple
Partition restatement	Reporting tables partitioned by impacted period
Full rebuild	State logic complex or low data volume
Versioned materialization	Audit requires old/new comparison

For regulatory reporting, versioned materialization is often the safest.

31. Correction and State Machines

Case lifecycle pipelines often use state machines.

Corrections can invalidate a previous transition path.

Example:

OPEN -> INVESTIGATING -> CLOSED

A correction says the INVESTIGATING effective date was earlier.

Effects:

duration in OPEN changes
SLA clock start changes
report period changes
breach detection changes

The state machine must be able to recompute over valid-time-ordered assertions.

Pattern:

ledger of assertions -> sort by valid time -> replay domain state machine -> produce versioned projection

Do not patch only the final state.

32. Java State Machine Rebuild

public final class CaseLifecycleRebuilder {
    public CaseLifecycleProjection rebuild(
            CaseId caseId,
            List<CaseLifecycleAssertion> assertions,
            TruthMode truthMode,
            Instant validAsOf,
            Instant recordedAsOf
    ) {
        List<CaseLifecycleAssertion> visible = assertions.stream()
                .filter(a -> visibleUnder(a, truthMode, validAsOf, recordedAsOf))
                .sorted(Comparator
                        .comparing((CaseLifecycleAssertion a) -> a.validTime().fromInclusive())
                        .thenComparing(a -> a.recordedTime().fromInclusive())
                        .thenComparing(a -> a.assertionId().value()))
                .toList();

        CaseLifecycleState state = CaseLifecycleState.initial(caseId);
        for (CaseLifecycleAssertion assertion : visible) {
            state = state.apply(assertion);
        }
        return state.toProjection();
    }
}

Sorting is not cosmetic. It is part of deterministic correctness.
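The rebuild above depends on a `visibleUnder` filter that the lesson does not define. One possible reading is sketched here for four of the six truth modes; `AS_REPORTED` and `REVISED_TRUTH` are left out because they need report-run lineage rather than a per-assertion predicate. The `Times` record, the reduced `Mode` enum, and the choice to treat an assertion as relevant once its valid interval has started (so earlier transitions stay available for replay) are all assumptions of this sketch.

```java
import java.time.Instant;

final class VisibilitySketch {
    static final Instant INFINITY = Instant.parse("9999-12-31T00:00:00Z");

    // Reduced copy of the truth modes, for illustration only.
    enum Mode { CURRENT_ACCEPTED, AS_KNOWN_AT, AS_EFFECTIVE_AT, SOURCE_OBSERVED }

    record Times(Instant validFrom, Instant validTo, Instant recordedFrom, Instant recordedTo) {}

    static boolean visibleUnder(Times a, Mode mode, Instant validAsOf, Instant recordedAsOf) {
        boolean stillAccepted = a.recordedTo().equals(INFINITY);
        boolean knownThen = !recordedAsOf.isBefore(a.recordedFrom())
            && recordedAsOf.isBefore(a.recordedTo());
        boolean startedBy = !a.validFrom().isAfter(validAsOf);
        return switch (mode) {
            case CURRENT_ACCEPTED -> stillAccepted;
            case AS_EFFECTIVE_AT -> stillAccepted && startedBy;
            case AS_KNOWN_AT -> knownThen && startedBy;
            case SOURCE_OBSERVED -> true; // every raw assertion, no correction collapse
        };
    }

    public static void main(String[] args) {
        Times superseded = new Times(Instant.parse("2026-03-31T00:00:00Z"), INFINITY,
            Instant.parse("2026-04-02T10:00:00Z"), Instant.parse("2026-04-10T09:00:00Z"));
        Instant validAsOf = Instant.parse("2026-04-01T00:00:00Z");
        Instant april5 = Instant.parse("2026-04-05T00:00:00Z");
        System.out.println(visibleUnder(superseded, Mode.AS_KNOWN_AT, validAsOf, april5));      // true
        System.out.println(visibleUnder(superseded, Mode.CURRENT_ACCEPTED, validAsOf, april5)); // false
    }
}
```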

33. Auditing Corrections

Every correction should answer:

Question	Evidence
What was corrected?	`supersedes_assertion_id` / `corrects_event_id`
Why?	correction reason
Who/what caused it?	causation ID, actor, source command
When did it become known?	recorded time
What business period changed?	valid time range
What outputs were impacted?	impact analysis result
What restatements were published?	restatement metadata
What old output was superseded?	superseded run/report ID

A correction without a reason is weak evidence.

34. Data Quality Rules for Bitemporal Tables

Required checks:

valid interval non-empty
recorded interval non-empty
no overlap for mutually exclusive facts
every correction references an existing assertion or is pending/quarantined
every closed recorded interval has a superseding/retraction reason
no assertion uses processing time as valid time unless explicitly allowed
output lineage present
source event ID present
duplicate assertion ID rejected
current projection matches ledger query

Example validation:

public final class BitemporalValidator {
    public List<Violation> validate(CaseStatusAssertion assertion) {
        List<Violation> violations = new ArrayList<>();

        if (!assertion.validTime().fromInclusive().isBefore(assertion.validTime().toExclusive())) {
            violations.add(new Violation("VALID_TIME_EMPTY"));
        }

        if (!assertion.recordedTime().fromInclusive().isBefore(assertion.recordedTime().toExclusive())) {
            violations.add(new Violation("RECORDED_TIME_EMPTY"));
        }

        if (assertion.sourceEventId() == null) {
            violations.add(new Violation("MISSING_SOURCE_EVENT_ID"));
        }

        return violations;
    }
}

35. Bitemporal and Backfill

Backfill and bitemporal design are tightly connected.

Backfill can operate in different truth modes.

Backfill mode	Meaning
Current accepted rebuild	Use latest corrections
As-known-at rebuild	Reproduce what would have been produced at past recorded time
Restatement rebuild	Produce corrected output and supersede old output
Source-observed rebuild	Rebuild exactly from raw assertions without correction collapse

The backfill manifest must state the truth mode.

{
  "runId": "bf-2026-04-sla-restatement-001",
  "truthMode": "REVISED_TRUTH",
  "validRange": {
    "from": "2026-03-01",
    "to": "2026-04-01"
  },
  "recordedAsOf": "2026-04-12T00:00:00Z"
}

Without truth mode, a backfill is ambiguous.

36. Regulatory Reporting Pattern

A defensible regulatory report should store:

report run ID
report period
valid-time range
recorded-as-of time
source snapshots
transform version
reference data version
input counts
output counts
corrections included
restatements superseded
approver

Report output table:

create table regulatory_report_case_sla (
    report_run_id varchar not null,
    report_period varchar not null,
    jurisdiction varchar not null,
    breach_count bigint not null,
    open_case_count bigint not null,
    valid_from date not null,
    valid_to date not null,
    recorded_as_of timestamp not null,
    produced_by_run_id varchar not null,
    supersedes_report_run_id varchar null,
    primary key (report_run_id, jurisdiction)
);

This enables:

Show me March report as filed on April 5.
Show me March report restated on April 12.
Show me why they differ.

37. Correction Impact Diff

For every restatement, produce a diff summary.

Example:

{
  "oldReportRunId": "report-2026-04-05-001",
  "newReportRunId": "report-2026-04-12-001",
  "period": "2026-03",
  "differences": [
    {
      "metric": "sla_breach_count",
      "jurisdiction": "JKT",
      "oldValue": 182,
      "newValue": 189,
      "delta": 7
    }
  ],
  "causes": [
    {
      "correctionReason": "EFFECTIVE_DATE_CORRECTION",
      "count": 9
    },
    {
      "correctionReason": "LATE_CASE_CLOSURE",
      "count": 3
    }
  ]
}

This is more useful than telling stakeholders “the pipeline was fixed.”
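The `differences` part of such a summary can be computed by comparing the old and new report values per key. This is a rough sketch: `Difference` and `diff` are hypothetical names, the `BDG` jurisdiction is invented sample data, and treating a key missing from one report as zero is an assumption that may not suit every metric. Attributing causes still requires the correction lineage and is not shown.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class ReportDiffSketch {
    record Difference(long oldValue, long newValue) {
        long delta() { return newValue - oldValue; }
    }

    // Returns only the keys whose value changed between the two report runs.
    static Map<String, Difference> diff(Map<String, Long> oldReport, Map<String, Long> newReport) {
        Set<String> keys = new TreeSet<>(oldReport.keySet());
        keys.addAll(newReport.keySet());
        Map<String, Difference> changed = new TreeMap<>();
        for (String key : keys) {
            long before = oldReport.getOrDefault(key, 0L); // assumption: missing means zero
            long after = newReport.getOrDefault(key, 0L);
            if (before != after) {
                changed.put(key, new Difference(before, after));
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<String, Long> filedApril5 = Map.of("JKT", 182L, "BDG", 40L);
        Map<String, Long> restatedApril12 = Map.of("JKT", 189L, "BDG", 40L);
        Map<String, Difference> changed = diff(filedApril5, restatedApril12);
        System.out.println(changed.keySet());           // [JKT]
        System.out.println(changed.get("JKT").delta()); // 7
    }
}
```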

38. Anti-Patterns

Anti-pattern: single `updated_at` for all time semantics

updated_at cannot answer valid-time and transaction-time questions.

Anti-pattern: overwriting corrections in place

You lose evidence of prior belief.

Anti-pattern: correction as delete + insert with no link

You cannot explain lineage.

Anti-pattern: using processing time as effective time

A pipeline delay would then change business truth.

Anti-pattern: current reference join for historical report reproduction

You may use knowledge that was not available at report time.

Anti-pattern: treating all late data as duplicate

Late data may be a legitimate old-valid-time assertion.

Anti-pattern: no truth mode in consumer API

Consumers unknowingly mix current truth, historical belief, and restated truth.

39. Testing Bitemporal Pipelines

39.1 As-known-at test

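@Test
void queryReturnsOldBeliefBeforeCorrectionRecorded() {
    var oldAssertion = assertion(
            validFrom("2026-03-31"),
            recordedFrom("2026-04-02"),
            recordedTo("2026-04-10")
    );

    var correctedAssertion = assertion(
            validFrom("2026-03-30"),
            recordedFrom("2026-04-10"),
            recordedTo(INFINITY)
    );

    // Arrange: both rows must be in the ledger that `query` reads.
    // `ledger` is a hypothetical test fixture, like the helper methods used here.
    ledger.insert(oldAssertion);
    ledger.insert(correctedAssertion);

    var result = query.asKnownAt(
            caseId("C-100"),
            validAt("2026-03-31"),
            recordedAt("2026-04-05")
    );

    // On April 5 the correction had not been recorded yet.
    assertEquals(oldAssertion, result);
}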

39.2 Current accepted test

After a correction, the current accepted view should use the corrected assertion.

39.3 Overlap test

Mutually exclusive facts must not overlap under current recorded view.

39.4 Restatement impact test

A correction moving a fact from April to March should restate both March and April aggregates.

39.5 Reference data time test

Historical report reproduction must not use future reference data unless truth mode permits it.

39.6 Replay determinism test

Replaying the same ledger with the same truth mode and as-of parameters must produce the same projection every time.

40. Case Study: Enforcement Lifecycle Corrections

Domain:

cases move through lifecycle states
SLA depends on state, jurisdiction calendar, pauses, escalation level
legal correction can change effective date
reports are submitted monthly

Events:

CaseOpened
CaseAssigned
CaseEscalated
SlaPaused
SlaResumed
CaseDecisionIssued
CaseClosed
CaseStatusCorrected
CaseEffectiveDateCorrected

Pipeline design:

outbox emits canonical lifecycle events
normalizer extracts valid time and recorded time
assertion ledger stores lifecycle assertions
correction resolver supersedes old assertions
current projection serves operational analytics
reporting pipeline generates monthly output with recorded_as_of
restatement pipeline publishes corrected reports with diff evidence

Key invariant:

Reports are not overwritten silently. They are superseded by restatements with explicit recorded_as_of and reason.

41. Production Checklist

Before calling a correction pipeline production-grade, verify:

42. The Core Lesson

Bitemporal modeling is not academic decoration.

It is the difference between:

This is the value now.

and:

This is what we believed then, about what was effective then, produced by this run, based on these source assertions, later superseded by this correction for this reason.

For ordinary dashboards, the first may be enough.

For enforcement lifecycle systems, audit trails, financial ledgers, legal decisions, compliance reporting, and regulatory defensibility, the second is often required.

The mature pipeline does not erase history.

It records how truth changed.
