
Testing Strategy for Large Scale ERP

Learn Java Large Scale ERP - Part 029

Testing strategy, golden datasets, business invariant tests, scenario regression packs, and failure-oriented quality engineering for large-scale ERP systems built with Java.


Part 029 — Testing Strategy for Large Scale ERP

Core idea: ERP testing is not primarily about checking methods, screens, or controllers. It is about proving that business invariants survive configuration, workflow, concurrency, integration, migration, reporting, and operational failure.

Large-scale ERP quality problems are rarely caused by a single missing unit test. They happen because the test strategy does not match the shape of the system.

A serious ERP system has:

  • long-running business lifecycles;
  • effective-dated configuration;
  • legal and fiscal constraints;
  • multi-entity organizational scope;
  • approval and segregation-of-duties constraints;
  • financial and inventory ledgers;
  • asynchronous integration;
  • batch jobs;
  • reporting/read models;
  • migration and cutover flows;
  • audit and support requirements.

Testing such a system requires more than the classic testing pyramid. The pyramid still matters, but it is insufficient. ERP needs a business invariant testing strategy.

This part focuses on how top engineers design a testing system for ERP correctness, regression resistance, and supportable change.


1. Kaufman Skill Deconstruction

To become effective at large-scale ERP testing, decompose the skill into these sub-skills:

Sub-skill | What top engineers can do
Invariant discovery | Identify non-negotiable rules such as balanced postings, stock conservation, period locks, and SoD.
Scenario modelling | Convert real business processes into stable, executable regression scenarios.
Golden dataset design | Build canonical test data that represents realistic organization, master data, configuration, and transactions.
Test boundary selection | Decide whether a rule belongs in unit, domain, service, integration, workflow, batch, or end-to-end tests.
Data determinism | Control clocks, IDs, currencies, sequences, exchange rates, and effective dates.
Failure modelling | Test retries, duplicate messages, partial failures, deadlocks, timeouts, unknown outcomes, and recovery.
Reconciliation testing | Prove that subledger, GL, inventory, payment, and reporting views reconcile.
Security/control testing | Verify permissions, approval authority, SoD, emergency override, and audit evidence.
Test environment engineering | Use realistic dependencies without turning tests into slow, fragile mini-production copies.
Regression governance | Keep the test suite trustworthy as modules, configs, tenants, and localizations grow.

The practical goal is not “high coverage.” The goal is high confidence in business correctness under change.


2. Why ERP Testing Is Different

A normal application often tests whether a request returns the expected response. ERP must test whether the response is legally, financially, operationally, and historically defensible.

2.1 ERP correctness has multiple dimensions

Dimension | Example failure
Functional correctness | Valid purchase order cannot be approved and released.
Financial correctness | Goods receipt posts to the wrong inventory account.
Temporal correctness | Backdated invoice bypasses period lock.
Organizational correctness | Branch user sees another legal entity's invoice.
Lifecycle correctness | Cancelled document can still be posted.
Integration correctness | Bank payment confirmation creates duplicate settlement.
Reporting correctness | AR aging report disagrees with AR subledger.
Audit correctness | Approval decision exists but authority context is not captured.
Operational correctness | Posting batch fails halfway and cannot resume safely.
Migration correctness | Opening balances load but do not reconcile to legacy trial balance.

2.2 The wrong test focus creates false confidence

A team can have thousands of tests and still ship broken ERP behavior if tests mostly assert:

  • controller status codes;
  • mapper fields;
  • repository CRUD;
  • happy-path UI clicks;
  • mocked database behavior;
  • snapshots of JSON without business semantics;
  • service methods without realistic configuration;
  • integration events without idempotency and reconciliation.

Those tests may be useful, but they are not enough. ERP testing must focus on state, money, stock, authority, lifecycle, and evidence.


3. The ERP Testing Mental Model

Think of the ERP test strategy as concentric safety layers around business truth.

The core rule:

Every test must answer: which business invariant, risk, or operational guarantee does this protect?

If a test cannot answer that, it may still be useful, but it should not be treated as strategic ERP quality coverage.


4. Testing Pyramid Is Necessary but Not Sufficient

The classic pyramid is still useful, but ERP needs an additional lens: business criticality.

Test type | ERP use
Unit test | Tax rounding, price allocation, state transition guard, validation rule.
Domain invariant test | Journal balance, stock ledger quantity conservation, SoD rule.
Application service test | Submit PO, approve invoice, post GRN, reverse journal.
Repository/integration test | Locking, unique constraints, sequence behavior, isolation behavior.
Workflow test | Approval routing, delegation, escalation, reassignment, cancellation.
Scenario test | Procure-to-pay, order-to-cash, month-end close, return flow.
Contract test | API/event payload compatibility and consumer assumptions.
Batch test | Restartability, chunk checkpoint, duplicate prevention, reconciliation.
Migration test | Staging validation, idempotent import, opening balance reconciliation.
Report test | Source-to-report reconciliation, period correctness, authorization filter.
Concurrency test | Reservation race, numbering race, close/post race.
End-to-end test | Critical smoke across UI/API/workflow/reporting.

The question is not “which layer is best?” The question is:

What is the cheapest test that gives real confidence for this ERP risk?


5. The ERP Invariant Catalogue

Before writing tests, build an invariant catalogue. It is the testing equivalent of a domain constitution.

5.1 Example invariant catalogue

Area | Invariant | Example test
General ledger | Every posted journal balances per currency and legal entity. | Generate journal lines and assert debit equals credit.
Period close | No posting into closed period except approved adjustment process. | Try backdated posting after close.
Inventory | Stock balance equals sum of immutable stock ledger movements. | Post receipt, transfer, issue, return, count adjustment.
Reservation | Reserved quantity cannot exceed available-to-promise under policy. | Race two sales orders for same stock.
Purchase order | PO release requires valid approval authority at decision time. | Change approval matrix after decision and assert evidence is stable.
Invoice matching | Vendor invoice cannot be paid if matching exception is unresolved. | Attempt payment proposal with 3-way mismatch.
Tax | Tax result must be reproducible from document, config version, and calculation trace. | Recalculate using stored config version.
Credit control | Order release cannot exceed approved credit exposure unless override exists. | Simulate unpaid AR and high-value order.
Legal numbering | Posted legal number is unique and immutable within sequence scope. | Concurrent posting attempts.
Audit | Material state transition has actor, timestamp, reason, before/after state, and correlation ID. | Submit/approve/post and verify audit timeline.
Security | User cannot approve own request when SoD forbids it. | Maker-checker test.
Reporting | AR aging total equals open AR subledger balance for same cutoff. | Compare report output to subledger projection.

5.2 Invariant priority

Not all invariants deserve the same test budget.

Priority | Characteristics | Example
P0 | Legal, financial, irreversible, or externally visible. | Balanced journal, legal invoice number, payment settlement.
P1 | Operationally severe but recoverable. | Stock reservation, workflow escalation, credit exposure.
P2 | Important but local and correctable. | UI warning, default value, optional field validation.
P3 | Low-risk presentation or convenience behavior. | Dashboard sorting preference.

A top-tier ERP test strategy spends disproportionate effort on P0 and P1 invariants.


6. Golden Dataset Architecture

ERP tests need stable data. But “test fixtures” are not enough. You need a golden dataset: a canonical miniature enterprise that contains the minimum realistic data required to exercise business behavior.

6.1 What a golden dataset contains

Dataset slice | Examples
Organization | tenant, legal entities, branches, warehouses, cost centers, profit centers.
Fiscal setup | fiscal year, periods, close status, accounting calendar.
Chart of accounts | asset, liability, revenue, expense, inventory, tax, control accounts.
Users and roles | requester, buyer, approver, warehouse operator, accountant, auditor.
Approval matrix | thresholds, delegation, SoD constraints, emergency authority.
Master data | vendors, customers, items, UOM, tax codes, currencies, exchange rates.
Pricing/tax config | price books, discount rules, tax jurisdictions, rounding policy.
Inventory setup | warehouses, bins, lot/serial policy, valuation policy.
Open balances | GL balances, AR/AP open items, inventory on hand.
Reference documents | sample POs, sales orders, invoices, receipts, shipments.

6.2 Golden dataset principles

Principle | Why it matters
Deterministic | Test outcome should not depend on current date, random IDs, or environment order.
Minimal but realistic | Avoid 10,000 fixture rows when 200 well-designed rows cover the domain.
Versioned | Dataset changes should be reviewed like code.
Named scenarios | Rows should be traceable to scenarios: P2P_STANDARD, O2C_CREDIT_HOLD, GL_PERIOD_CLOSE.
Rebuildable | The dataset should be recreated from source, not manually patched.
Tenant-aware | Multi-tenant and multi-entity tests must verify isolation.
Effective-dated | Config and master data should include past/current/future versions.
Auditable | Golden data should include expected business meaning, not just SQL inserts.

6.3 Golden dataset layout

src/test/resources/golden-data/
  org/
    tenants.yml
    legal-entities.yml
    branches.yml
    cost-centers.yml
  finance/
    chart-of-accounts.yml
    fiscal-periods.yml
    opening-balances.yml
  security/
    users.yml
    roles.yml
    approval-matrix.yml
    sod-rules.yml
  master-data/
    vendors.yml
    customers.yml
    items.yml
    tax-codes.yml
  scenarios/
    p2p-standard.yml
    p2p-three-way-mismatch.yml
    o2c-credit-hold.yml
    inventory-lot-trace.yml
    gl-period-close.yml

6.4 Golden data anti-pattern

Avoid fixture piles that nobody understands:

insert-001.sql
insert-002.sql
legacy-test-data.sql
more-data.sql
fix-data-again.sql

That style creates accidental tests. A golden dataset creates intentional tests.


7. Scenario DSL for ERP Tests

A good ERP test reads like a business scenario, not like database plumbing.

7.1 Poor test expression

@Test
void testPoFlow() {
    var vendor = vendorRepository.save(new Vendor(...));
    var item = itemRepository.save(new Item(...));
    var po = poRepository.save(new PurchaseOrder(...));
    // 120 lines later...
    assertEquals("POSTED", invoice.getStatus());
}

This is hard to review. It hides intent.

7.2 Better scenario expression

@Test
void standardProcureToPay_postsInventoryAndAPAndReconcilesToGL() {
    scenario
        .givenCompany("ID01")
        .givenVendor("VENDOR-STEEL-01")
        .givenItem("RAW-STEEL", onHand("WH-JKT", 0))
        .givenApprovalMatrix("BUYER_LIMIT_100M")
        .whenRequesterCreatesRequisition(amount("IDR", 50_000_000))
        .andBuyerConvertsToPurchaseOrder()
        .andApproverApproves()
        .andWarehouseReceivesGoods(quantity(100))
        .andApRecordsInvoiceMatchingReceipt()
        .andPaymentRunPaysInvoice()
        .thenPurchaseOrderIsClosed()
        .andInventoryLedgerReconciles()
        .andSubledgerReconcilesToGeneralLedger()
        .andAuditTimelineIsComplete();
}

This test is not “less technical.” It is more precise because the technical mechanics are encapsulated behind domain-level verbs.

7.3 Scenario DSL design rules

Rule | Explanation
Use business verbs | approveInvoice, postReceipt, releaseOrder, not callServiceX.
Keep technical escape hatch | Allow lower-level assertions for lock, sequence, event, and SQL behavior.
Control time | Scenarios must run at a fixed business clock.
Capture IDs semantically | Use document references, not random IDs in assertions.
Assert business outcomes | Status alone is not enough; assert ledger, stock, audit, and report effects.
Reuse carefully | DSL should improve clarity, not hide different business conditions behind generic helpers.
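
To make these rules concrete, here is a minimal sketch of how a fluent scenario object could be structured. It is only an illustration: `ReceiptScenario` and its verbs are hypothetical names, and the in-memory maps stand in for the application services a real DSL would call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory scenario; in a real suite each verb would call an application service.
final class ReceiptScenario {
    private final Map<String, Integer> onHand = new HashMap<>(); // "item@warehouse" -> quantity
    private final List<String> auditTimeline = new ArrayList<>();
    private String stockKey;

    ReceiptScenario givenItem(String item, String warehouse, int openingQuantity) {
        stockKey = item + "@" + warehouse; // semantic reference, not a random ID
        onHand.put(stockKey, openingQuantity);
        return this;
    }

    ReceiptScenario whenWarehouseReceives(int quantity) {
        requireGiven();
        onHand.merge(stockKey, quantity, Integer::sum);
        auditTimeline.add("GOODS_RECEIPT_POSTED");
        return this;
    }

    // Business outcome assertion: the stock effect, not just a status flag.
    ReceiptScenario thenOnHandIs(int expected) {
        requireGiven();
        int actual = onHand.get(stockKey);
        if (actual != expected) {
            throw new AssertionError("expected on-hand " + expected + " but was " + actual);
        }
        return this;
    }

    ReceiptScenario andAuditTimelineContains(String eventType) {
        if (!auditTimeline.contains(eventType)) {
            throw new AssertionError("missing audit event " + eventType);
        }
        return this;
    }

    private void requireGiven() {
        if (stockKey == null) {
            throw new IllegalStateException("call givenItem(...) first");
        }
    }
}
```

A call such as `new ReceiptScenario().givenItem("RAW-STEEL", "WH-JKT", 0).whenWarehouseReceives(100).thenOnHandIs(100)` then reads as a business sentence while the mechanics stay behind the verbs.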

8. Domain Invariant Tests

Domain invariant tests are fast tests that prove core rules without requiring full infrastructure.

8.1 General ledger balance invariant

final class JournalInvariantTest {

    @Test
    void postedJournalMustBalancePerCurrency() {
        Journal journal = Journal.draft("ID01")
            .debit(account("Inventory"), money("IDR", 1_000_000))
            .credit(account("GRIR"), money("IDR", 1_000_000));

        PostedJournal posted = journal.post(postingContext());

        assertThat(posted.totalDebit("IDR"))
            .isEqualByComparingTo(posted.totalCredit("IDR"));
    }

    @Test
    void unbalancedJournalCannotBePosted() {
        Journal journal = Journal.draft("ID01")
            .debit(account("Inventory"), money("IDR", 1_000_000))
            .credit(account("GRIR"), money("IDR", 999_999));

        assertThatThrownBy(() -> journal.post(postingContext()))
            .isInstanceOf(UnbalancedJournalException.class);
    }
}
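
The tests above assume a journal that knows whether it balances. A minimal sketch of that check follows, assuming a hypothetical `JournalLine` record and `BalanceCheck` helper that are not part of the original code; the comparison uses `signum()` so that `1000000` and `1000000.00` are treated as equal amounts.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical journal line: amount is always positive, the side is carried by the debit flag.
record JournalLine(String account, String currency, BigDecimal amount, boolean debit) {}

final class BalanceCheck {
    private BalanceCheck() {}

    // Net (debit minus credit) per currency; a balanced journal nets to zero in every currency.
    static Map<String, BigDecimal> netPerCurrency(List<JournalLine> lines) {
        Map<String, BigDecimal> net = new HashMap<>();
        for (JournalLine line : lines) {
            BigDecimal signed = line.debit() ? line.amount() : line.amount().negate();
            net.merge(line.currency(), signed, BigDecimal::add);
        }
        return net;
    }

    static boolean isBalancedPerCurrency(List<JournalLine> lines) {
        // signum() ignores scale, unlike BigDecimal.equals.
        return !lines.isEmpty()
            && netPerCurrency(lines).values().stream().allMatch(v -> v.signum() == 0);
    }
}
```

Checking per currency matters: a debit of 10 USD against a credit of 10 IDR sums to zero numerically but is not a balanced journal.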

8.2 State transition invariant

@ParameterizedTest
@CsvSource({
    "DRAFT, SUBMIT, SUBMITTED",
    "SUBMITTED, APPROVE, APPROVED",
    "APPROVED, POST, POSTED",
    "POSTED, REVERSE, REVERSED"
})
void legalTransitions(String from, String command, String to) {
    DocumentLifecycle lifecycle = DocumentLifecycle.from(State.valueOf(from));

    TransitionResult result = lifecycle.apply(Command.valueOf(command), context());

    assertThat(result.newState()).isEqualTo(State.valueOf(to));
}

@ParameterizedTest
@CsvSource({
    "DRAFT, POST",
    "CANCELLED, APPROVE",
    "POSTED, CANCEL",
    "REVERSED, POST"
})
void illegalTransitionsAreRejected(String from, String command) {
    DocumentLifecycle lifecycle = DocumentLifecycle.from(State.valueOf(from));

    assertThatThrownBy(() -> lifecycle.apply(Command.valueOf(command), context()))
        .isInstanceOf(IllegalTransitionException.class);
}

The point is not the testing library. The point is to make the transition table executable.
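
One way to back such tests is to keep the transition table as data. The sketch below is an assumption about how `DocumentLifecycle` might be implemented, not the author's code: `TransitionTable` is a hypothetical name, and the two `CANCEL` rows are an illustrative policy choice (cancellation allowed only before approval).

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;

enum State { DRAFT, SUBMITTED, APPROVED, POSTED, REVERSED, CANCELLED }

enum Command { SUBMIT, APPROVE, POST, REVERSE, CANCEL }

final class TransitionTable {
    private static final Map<State, Map<Command, State>> LEGAL = new EnumMap<>(State.class);

    static {
        allow(State.DRAFT, Command.SUBMIT, State.SUBMITTED);
        allow(State.SUBMITTED, Command.APPROVE, State.APPROVED);
        allow(State.APPROVED, Command.POST, State.POSTED);
        allow(State.POSTED, Command.REVERSE, State.REVERSED);
        // Assumption for illustration: cancellation is only legal before approval.
        allow(State.DRAFT, Command.CANCEL, State.CANCELLED);
        allow(State.SUBMITTED, Command.CANCEL, State.CANCELLED);
    }

    private TransitionTable() {}

    private static void allow(State from, Command command, State to) {
        LEGAL.computeIfAbsent(from, k -> new EnumMap<>(Command.class)).put(command, to);
    }

    // Empty result means the transition is illegal; anything not listed is rejected by default.
    static Optional<State> next(State from, Command command) {
        return Optional.ofNullable(LEGAL.getOrDefault(from, Map.of()).get(command));
    }
}
```

Because every pair not listed is illegal, adding a new state or command cannot silently open a path; it has to be added to the table and will show up in review.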


9. Application Service Tests

Application service tests prove that use cases enforce the right rules across domain objects, repositories, policies, and transactions.

9.1 What belongs here

Use case | What to assert
Submit requisition | lifecycle transition, validation, audit event, approval route creation.
Approve purchase order | authority snapshot, SoD, state transition, audit evidence.
Post goods receipt | stock movement, inventory valuation event, accounting event.
Post vendor invoice | matching status, AP open item, subledger entry.
Run payment proposal | eligibility, hold status, approval, duplicate protection.
Reverse journal | reversal link, opposite entries, period rule, audit event.

9.2 Application service test style

@Test
void approverCannotApproveOwnPurchaseOrder() {
    PurchaseOrderId poId = fixtures.purchaseOrder()
        .requestedBy("alice")
        .amount("IDR", 250_000_000)
        .submitted()
        .id();

    assertThatThrownBy(() -> purchaseOrderService.approve(
        poId,
        ApprovalCommand.by("alice", "looks good")
    )).isInstanceOf(SegregationOfDutiesViolation.class);

    assertThat(auditEvents.forDocument(poId))
        .anySatisfy(event -> {
            assertThat(event.type()).isEqualTo("APPROVAL_REJECTED_BY_CONTROL");
            assertThat(event.reasonCode()).isEqualTo("MAKER_CHECKER_VIOLATION");
        });
}

Notice the test asserts both prevention and evidence.


10. Persistence and Database Integration Tests

ERP systems often rely on database constraints for correctness. You cannot mock those away.

10.1 Database behavior worth testing

Behavior | Why it matters
Unique constraints | Prevent duplicate legal numbers, idempotency keys, processing records.
Foreign keys | Prevent orphan ledger lines, document lines, and audit links.
Check constraints | Enforce non-negative quantities, debit/credit shape, status values.
Isolation behavior | Understand what concurrent transactions can see.
Locking | Prevent race conditions in stock reservation and period close.
Index strategy | Prevent critical queries from degrading under scale.
Sequence behavior | Avoid wrong assumptions about gaps, rollback, and legal numbering.
Materialized views | Validate refresh and reconciliation semantics.

10.2 Use real infrastructure for infrastructure rules

A repository test against an in-memory database often misses important behavior: locking, isolation, SQL dialect, indexes, constraints, and execution plans.

For ERP-critical persistence rules, use the same database engine family as production in tests. Testcontainers is useful here because it can provision throwaway databases and brokers for integration tests.

@Testcontainers
class LegalNumberRepositoryIT {

    @Container
    static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:17");

    @Test
    void legalNumberScopeMustBeUnique() {
        legalNumberRepository.reserve("ID01", "SALES_INVOICE", "2026", "INV-000001");

        assertThatThrownBy(() ->
            legalNumberRepository.reserve("ID01", "SALES_INVOICE", "2026", "INV-000001")
        ).isInstanceOf(DataIntegrityViolationException.class);
    }
}

This is not overkill for ERP. This is where many expensive production bugs live.


11. Workflow and Approval Tests

Workflow tests must verify both the path and the control evidence.

11.1 Approval route test

@Test
void highValuePoRequiresFinanceAndDirectorApproval() {
    var po = scenario.givenPurchaseOrder()
        .company("ID01")
        .amount("IDR", 750_000_000)
        .submittedBy("buyer-01")
        .submit();

    ApprovalRoute route = approvalService.routeFor(po.id());

    assertThat(route.steps())
        .extracting(ApprovalStep::role)
        .containsExactly("FINANCE_CONTROLLER", "DIRECTOR");

    assertThat(route.policySnapshot().configVersion())
        .isNotBlank();
}

11.2 Delegation test

@Test
void delegatedApproverMustLeaveDelegationEvidence() {
    fixtures.delegation()
        .from("director-01")
        .to("director-delegate-01")
        .effectiveOn("2026-07-01")
        .reason("travel")
        .create();

    var po = fixtures.purchaseOrder().amount("IDR", 800_000_000).submitted().id();

    approvalService.approve(po, ApprovalCommand.by("director-delegate-01", "approved"));

    assertThat(auditEvents.forDocument(po))
        .anySatisfy(event -> {
            assertThat(event.type()).isEqualTo("APPROVAL_GRANTED");
            assertThat(event.actor()).isEqualTo("director-delegate-01");
            assertThat(event.delegatedFrom()).isEqualTo("director-01");
            assertThat(event.policySnapshot()).contains("delegationId");
        });
}

The test must prove that the system remembers why the decision was valid at that time.


12. Financial Posting Tests

Posting tests are among the most important ERP regression tests.

12.1 Posting scenario matrix

Scenario | Required assertions
Goods receipt | inventory debit, GRIR credit, stock ledger movement, receipt status.
Vendor invoice | AP credit, expense/inventory clearing, tax, invoice status.
Payment | AP debit, cash/bank credit, payment status, settlement link.
Sales shipment | COGS debit, inventory credit, stock issue.
Sales invoice | AR debit, revenue/tax credit, invoice legal number.
Customer receipt | cash debit, AR credit, settlement link.
Reversal | opposite entries, reversal reference, period rule, audit reason.

12.2 Posting test example

@Test
void goodsReceiptCreatesBalancedAccountingEventAndStockMovement() {
    var receipt = scenario
        .givenApprovedPurchaseOrder(item("RAW-STEEL"), quantity(100), unitPrice("IDR", 10_000))
        .whenWarehouseReceives(quantity(100));

    AccountingEvent event = accountingEvents.singleFor(receipt.id(), "GOODS_RECEIPT");

    assertThat(event.lines()).containsExactlyInAnyOrder(
        debit("Inventory-RawMaterial", "IDR", 1_000_000),
        credit("GRIR", "IDR", 1_000_000)
    );

    assertThat(event).isBalancedPerCurrency();
    assertThat(stockLedger.balance("RAW-STEEL", "WH-JKT")).isEqualTo(quantity(100));
    assertThat(reconciliation.receiptToAccounting(receipt.id())).isReconciled();
}

12.3 Posting tests must avoid vague assertions

Weak assertion:

assertThat(receipt.status()).isEqualTo(POSTED);

Strong assertion:

assertThat(receipt.status()).isEqualTo(POSTED);
assertThat(stockLedger.movementsFor(receipt.id())).hasSize(1);
assertThat(accountingEvents.forSource(receipt.id())).singleElement().satisfies(AccountingAssertions::isBalanced);
assertThat(auditEvents.forDocument(receipt.id())).containsEvent("GOODS_RECEIPT_POSTED");
assertThat(reconciliation.receiptToStockAndAccounting(receipt.id())).isReconciled();

ERP tests should assert the whole effect, not just the final status.


13. Reconciliation Tests

Reconciliation is not only a production activity. It must be tested.

13.1 Reconciliation test types

Reconciliation | Test question
AP subledger to GL | Does AP control account equal open AP subledger?
AR subledger to GL | Does AR control account equal open AR subledger?
Inventory ledger to GL | Does inventory valuation equal inventory account balance?
Payment gateway to bank ledger | Are external confirmations matched or exceptioned?
Stock movement to stock balance | Does projected balance equal ledger sum?
Report to source | Does aging/report output match authoritative subledger cutoff?
Migration to opening balance | Does imported opening position equal signed-off legacy balance?

13.2 Reconciliation assertion

@Test
void arAgingMustReconcileToOpenArSubledgerAtCutoff() {
    LocalDate cutoff = LocalDate.parse("2026-06-30");

    Money subledgerTotal = arSubledger.openBalance(company("ID01"), cutoff);
    Money reportTotal = arAgingReport.generate(company("ID01"), cutoff).grandTotal();

    assertThat(reportTotal).isEqualByComparingTo(subledgerTotal);
}

Do not only test report formatting. Test report truth.
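
As an illustration of what "report truth" means for AR aging, the sketch below buckets open items by days overdue and lets a test compare the bucket total with the sum of the open items. All names (`OpenItem`, `ArAging`) and the bucket boundaries are assumptions chosen for the example; real aging policies vary.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical open AR item as of a cutoff date.
record OpenItem(String invoice, LocalDate dueDate, BigDecimal openAmount) {}

final class ArAging {
    private ArAging() {}

    // Assumed bucket boundaries, for illustration only.
    static String bucket(LocalDate dueDate, LocalDate cutoff) {
        long daysOverdue = ChronoUnit.DAYS.between(dueDate, cutoff);
        if (daysOverdue <= 0) return "CURRENT";
        if (daysOverdue <= 30) return "1-30";
        if (daysOverdue <= 60) return "31-60";
        return "60+";
    }

    static Map<String, BigDecimal> age(List<OpenItem> items, LocalDate cutoff) {
        Map<String, BigDecimal> buckets = new TreeMap<>();
        for (OpenItem item : items) {
            buckets.merge(bucket(item.dueDate(), cutoff), item.openAmount(), BigDecimal::add);
        }
        return buckets;
    }

    // The reconciliation figure: every open item lands in exactly one bucket,
    // so this total must equal the subledger open balance at the same cutoff.
    static BigDecimal total(Map<String, BigDecimal> buckets) {
        return buckets.values().stream().reduce(BigDecimal.ZERO, BigDecimal::add);
    }
}
```

The useful property to assert is conservation: no item is dropped or double-counted when it moves between buckets as the cutoff changes.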


14. Integration Failure Tests

ERP integration tests must cover duplicate, delayed, out-of-order, and partially failed messages.

14.1 Integration failure matrix

Failure | Example | Test expectation
Duplicate message | Payment confirmation delivered twice. | Single settlement outcome.
Timeout after side effect | Bank accepted payment but ERP did not receive response. | Unknown outcome becomes reconciliation case.
Out-of-order event | Shipment arrives before allocation event. | Event is parked or handled deterministically.
Poison message | Tax response has invalid structure. | Message goes to exception queue with evidence.
Retry storm | External system unavailable. | Backoff, circuit breaker, no duplicate posting.
Contract drift | WMS changes field semantics. | Contract test fails before production.

14.2 Idempotent consumer test

@Test
void duplicatePaymentConfirmationDoesNotDoubleSettleInvoice() {
    var invoice = fixtures.vendorInvoice().posted().id();
    var confirmation = bankConfirmation(invoice, "BANK-TXN-9988", money("IDR", 25_000_000));

    paymentConsumer.handle(confirmation);
    paymentConsumer.handle(confirmation);

    assertThat(apSubledger.settlementsFor(invoice)).hasSize(1);
    assertThat(processingLedger.entriesFor("BANK-TXN-9988"))
        .extracting(ProcessingEntry::status)
        .containsExactly("PROCESSED", "DUPLICATE_IGNORED");
}

The test asserts business outcome, not merely that a handler returned successfully.
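
For reference, a consumer that would satisfy a test like this can be sketched in memory as follows. The names (`BankConfirmation`, `IdempotentPaymentConsumer`) are hypothetical, and the `HashSet` stands in for what would normally be a unique constraint on the bank transaction ID, written in the same database transaction as the settlement.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical message: the bank transaction ID is the idempotency key.
record BankConfirmation(String bankTxnId, String invoiceId, long amountMinor) {}

final class IdempotentPaymentConsumer {
    private final Set<String> processedTxnIds = new HashSet<>();
    private final List<BankConfirmation> settlements = new ArrayList<>();
    private final List<String> processingLog = new ArrayList<>();

    // synchronized so that "check key" and "record settlement" happen as one step.
    synchronized void handle(BankConfirmation confirmation) {
        if (!processedTxnIds.add(confirmation.bankTxnId())) {
            processingLog.add("DUPLICATE_IGNORED"); // evidence that the duplicate was seen
            return;
        }
        settlements.add(confirmation);
        processingLog.add("PROCESSED");
    }

    synchronized List<BankConfirmation> settlements() {
        return List.copyOf(settlements);
    }

    synchronized List<String> processingLog() {
        return List.copyOf(processingLog);
    }
}
```

Note that the duplicate is recorded rather than silently dropped, which is what allows the test to assert on the processing ledger as well as on the settlement count.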


15. Migration and Cutover Tests

Migration tests should be treated as product tests because cutover is often a one-time production event with huge blast radius.

15.1 Migration test layers

Layer | What to test
Extract validation | Required fields, data types, legacy row counts, source checksums.
Mapping validation | code mapping, UOM conversion, account mapping, entity mapping.
Staging validation | duplicate detection, referential integrity, invalid effective dates.
Import idempotency | rerun does not duplicate documents or balances.
Business validation | migrated documents obey target ERP lifecycle rules.
Reconciliation | legacy totals equal target totals by signed-off dimensions.
Cutover rehearsal | timing, sequence, rollback, decision gates, sign-off.

15.2 Opening balance migration test

@Test
void migratedOpeningBalancesMustMatchSignedOffTrialBalance() {
    migrationJob.importOpeningBalances("cutover-2026-07-01");

    TrialBalance legacy = legacyExtract.signedOffTrialBalance("ID01", LocalDate.parse("2026-06-30"));
    TrialBalance target = generalLedger.trialBalance("ID01", LocalDate.parse("2026-06-30"));

    assertThat(target).matchesByAccountAndCurrency(legacy);
    assertThat(target).isBalanced();
    assertThat(migrationEvidence.forRun("cutover-2026-07-01")).hasApproverSignOff();
}

Migration tests are not optional for serious ERP engineering.
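
An assertion such as `matchesByAccountAndCurrency` is most useful when a failure names the accounts that differ. A minimal sketch of that comparison is shown below; `TrialBalanceDiff` and the `"account|currency"` key format are assumptions made for the example.

```java
import java.math.BigDecimal;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class TrialBalanceDiff {
    private TrialBalanceDiff() {}

    // Keys are "account|currency"; the result holds target minus legacy,
    // and only for keys where the difference is non-zero.
    static Map<String, BigDecimal> differences(Map<String, BigDecimal> legacy,
                                               Map<String, BigDecimal> target) {
        Set<String> keys = new TreeSet<>(legacy.keySet());
        keys.addAll(target.keySet()); // catches balances missing on either side
        Map<String, BigDecimal> diff = new TreeMap<>();
        for (String key : keys) {
            BigDecimal delta = target.getOrDefault(key, BigDecimal.ZERO)
                .subtract(legacy.getOrDefault(key, BigDecimal.ZERO));
            if (delta.signum() != 0) {
                diff.put(key, delta);
            }
        }
        return diff;
    }
}
```

An empty result means the two trial balances reconcile on the chosen dimensions; a non-empty one is already the exception list a cutover team would need.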


16. Report Tests

Reports are often treated as second-class artifacts. In ERP, reports frequently become the primary interface for finance, auditors, managers, and regulators.

16.1 Report test categories

Category | Example
Semantic correctness | AR aging bucket logic, inventory valuation, trial balance.
Cutoff correctness | As-of date, period, timezone, document effective date.
Scope correctness | legal entity, branch, cost center, tenant, security filter.
Freshness correctness | last projection checkpoint, source lag, refresh timestamp.
Reconciliation correctness | report total agrees with source-of-truth ledger.
Export correctness | CSV/Excel/PDF fields, encoding, row counts, totals.
Authorization correctness | user sees only allowed rows/columns.

16.2 Report truth assertion

@Test
void trialBalanceReportMustBalanceAndMatchGeneralLedger() {
    var report = trialBalanceReport.generate(company("ID01"), period("2026-06"));

    assertThat(report.totalDebit()).isEqualByComparingTo(report.totalCredit());
    assertThat(report.accountBalances()).isEqualTo(
        generalLedger.balancesByAccount(company("ID01"), period("2026-06"))
    );
    assertThat(report.metadata().sourceCheckpoint()).isNotNull();
}

A report without metadata about cutoff, source, and checkpoint is hard to defend.


17. Security and Control Tests

ERP access control bugs are not merely security bugs. They can create invalid approvals, unauthorized postings, privacy leaks, and audit failures.

17.1 Control test matrix

Control | Test
Role permission | Buyer can create PO but cannot post AP invoice.
Scope restriction | Branch user cannot access another branch's documents.
SoD | Requester cannot approve own purchase.
Threshold authority | Manager can approve up to configured amount only.
Delegation | Delegate can approve only within effective delegation period.
Emergency access | Break-glass action is time-bound and audited.
Export control | User cannot export restricted columns without permission.
Report row-level security | Report output excludes unauthorized legal entities.

17.2 Permission test style

@Test
void branchUserCannotReadOtherBranchInvoiceEvenIfTheyKnowTheId() {
    var invoice = fixtures.salesInvoice()
        .company("ID01")
        .branch("BANDUNG")
        .posted()
        .id();

    assertThatThrownBy(() -> invoiceQueryService.getInvoice(
        invoice,
        userContext("jakarta-branch-user")
    )).isInstanceOf(AccessDeniedException.class);

    assertThat(securityAudit.denialsFor("jakarta-branch-user"))
        .anyMatch(event -> event.resourceId().equals(invoice.toString()));
}

Always assert denial evidence for sensitive actions.


18. Concurrency Tests

Concurrency bugs in ERP often happen at high-value boundaries: stock, numbering, approval, period close, payment, and posting.

18.1 Concurrency scenarios worth testing

Scenario | Risk
Two orders reserve last stock | Overselling.
Two invoices request same legal number | Duplicate legal document number.
Period closes while posting runs | Posting into closed period.
Approver and requester change same document | Invalid lifecycle transition.
Payment retry and bank confirmation race | Duplicate settlement or wrong status.
Batch job and manual correction overlap | Inconsistent ledger/report projection.

18.2 Concurrent stock reservation test

@Test
void concurrentReservationsCannotExceedAvailableStock() throws Exception {
    fixtures.stock("ITEM-01", "WH-JKT", quantity(10));

    ExecutorService pool = Executors.newFixedThreadPool(2);
    CountDownLatch start = new CountDownLatch(1);

    try {
        Callable<ReservationResult> reserveSix = () -> {
            start.await(); // hold both workers so they are released together
            return reservationService.reserve("ITEM-01", "WH-JKT", quantity(6));
        };

        Future<ReservationResult> r1 = pool.submit(reserveSix);
        Future<ReservationResult> r2 = pool.submit(reserveSix);

        start.countDown();

        // Bounded waits: a deadlock should fail the test, not hang the build.
        List<ReservationResult> results = List.of(
            r1.get(10, TimeUnit.SECONDS),
            r2.get(10, TimeUnit.SECONDS));

        assertThat(results).filteredOn(ReservationResult::accepted).hasSize(1);
        assertThat(results).filteredOn(result -> !result.accepted()).hasSize(1);
        assertThat(stockProjection.available("ITEM-01", "WH-JKT")).isEqualTo(quantity(4));
    } finally {
        pool.shutdownNow(); // do not leak worker threads between tests
    }
}

This type of test is not perfectly deterministic unless the service boundary and database constraints are designed well. That is part of the value: bad concurrency design is hard to test cleanly.
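
To show what a cleanly testable boundary can look like, here is a minimal in-memory sketch in which the availability check and the decrement are a single atomic step. `StockReservations` is a hypothetical class; in a database-backed service the same guarantee would typically come from a row lock or a conditional update rather than an `AtomicInteger`.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class StockReservations {
    private final AtomicInteger available;

    StockReservations(int openingAvailable) {
        this.available = new AtomicInteger(openingAvailable);
    }

    // Returns true if the quantity was reserved. The compare-and-set loop makes
    // "check availability" and "decrement" one atomic step, so two racing callers
    // can never both succeed against the same units.
    boolean reserve(int quantity) {
        if (quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive");
        }
        while (true) {
            int current = available.get();
            if (current < quantity) {
                return false; // rejected: would oversell
            }
            if (available.compareAndSet(current, current - quantity)) {
                return true;
            }
            // Another thread changed the balance in between; re-read and retry.
        }
    }

    int available() {
        return available.get();
    }
}
```

With this shape, the outcome of the race is deterministic in the sense that matters: regardless of which caller wins, exactly one reservation of six is accepted from ten units and four remain.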


19. Performance Regression Tests

ERP performance tests should be scenario-based, not just endpoint-based.

19.1 Performance scenarios

Scenario | Metric
Month-end close posting batch | journals/minute, lock wait, failure recovery time.
MRP run | item-location combinations/minute, memory usage, planning latency.
AR aging report | response time by data volume and cutoff.
Bulk invoice import | rows/minute, validation error throughput.
Payment proposal | invoices evaluated/minute, duplicate prevention.
Stock availability query | p95 latency under order entry load.
Dashboard burst | database load, cache hit ratio, projection lag.

19.2 Performance gate example

Scenario: Trial balance report for one legal entity with 5 years of postings
Dataset: 20M journal lines, 2K accounts, 12 fiscal periods/year
Gate:
  - p95 response <= 3 seconds for precomputed read model
  - no sequential scan on journal_line
  - result reconciles to GL balance snapshot
  - query does not block posting transactions

Performance gates should include correctness gates. A fast wrong report is still wrong.


20. Test Data Builders vs Object Mothers vs Fixtures

20.1 Trade-off table

Technique | Strength | Weakness | ERP guidance
Object Mother | Easy reuse. | Becomes giant implicit fixture. | Use only for tiny value objects.
Test Data Builder | Expressive and composable. | Can hide invalid defaults. | Good for domain objects and documents.
Scenario Fixture | Business-readable setup. | Can become too magical. | Good for workflows and cross-module scenarios.
Golden Dataset | Realistic shared baseline. | Needs governance. | Essential for ERP regression.
SQL Fixture | Precise and fast. | Bypasses domain rules. | Use for infrastructure and read-model tests, not primary business setup.

20.2 Builder design

PurchaseOrderBuilder po = purchaseOrder()
    .company("ID01")
    .branch("JKT")
    .vendor("VENDOR-STEEL-01")
    .line("RAW-STEEL", quantity(100), unitPrice("IDR", 10_000))
    .requestedBy("buyer-01")
    .submittedOn("2026-07-01");

A builder should make invalid state visible. Avoid defaulting critical business facts silently.

Bad:

PurchaseOrder po = purchaseOrder().build(); // Which company? Which period? Which currency? Which approval policy?

Better:

PurchaseOrder po = purchaseOrder()
    .company("ID01")
    .currency("IDR")
    .fiscalPeriod("2026-07")
    .approvalPolicy("P2P_STANDARD_ID01")
    .build();
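
One way to enforce this, sketched under assumptions, is to have `build()` reject missing business facts instead of filling them in. The `PurchaseOrder` record here is a heavily reduced, hypothetical stand-in for a real document type, and the exception message format is invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily reduced document type for illustration only.
record PurchaseOrder(String company, String currency, String fiscalPeriod, String approvalPolicy) {}

final class PurchaseOrderBuilder {
    private String company;
    private String currency;
    private String fiscalPeriod;
    private String approvalPolicy;

    static PurchaseOrderBuilder purchaseOrder() {
        return new PurchaseOrderBuilder();
    }

    PurchaseOrderBuilder company(String value) { this.company = value; return this; }
    PurchaseOrderBuilder currency(String value) { this.currency = value; return this; }
    PurchaseOrderBuilder fiscalPeriod(String value) { this.fiscalPeriod = value; return this; }
    PurchaseOrderBuilder approvalPolicy(String value) { this.approvalPolicy = value; return this; }

    // No silent defaults: every critical business fact must be stated by the test.
    PurchaseOrder build() {
        List<String> missing = new ArrayList<>();
        if (company == null) missing.add("company");
        if (currency == null) missing.add("currency");
        if (fiscalPeriod == null) missing.add("fiscalPeriod");
        if (approvalPolicy == null) missing.add("approvalPolicy");
        if (!missing.isEmpty()) {
            throw new IllegalStateException(
                    "PurchaseOrder test data is missing: " + String.join(", ", missing));
        }
        return new PurchaseOrder(company, currency, fiscalPeriod, approvalPolicy);
    }
}
```

With this shape, `purchaseOrder().build()` fails immediately and names the facts the test forgot, rather than producing a document with an accidental company or period.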

21. Determinism: Clock, IDs, Sequences, Currency

ERP tests become flaky when hidden nondeterminism (wall-clock time, generated IDs, number sequences, exchange rates) leaks into scenarios.

21.1 Determinism checklist

| Concern | Control |
| --- | --- |
| Current time | Inject BusinessClock, do not call LocalDate.now() directly. |
| Timezone | Test cutoff behavior with explicit zone. |
| IDs | Use semantic test references or deterministic ID generator. |
| Legal number | Use test sequence scope and assert uniqueness/immutability. |
| Currency | Use fixed exchange rates and rounding policy. |
| Config | Use explicit config version/effective date. |
| Async | Use deterministic scheduler or await observable state with timeout. |
| Batch | Use explicit run ID and checkpoint state. |
| Reports | Use explicit cutoff and projection checkpoint. |
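
Two of these controls, deterministic IDs and a fixed exchange rate with an explicit rounding policy, can be sketched in a few lines. Both class names and the ID format are hypothetical and chosen only for the example; the sketch assumes `BigDecimal` amounts and a rounding mode supplied by the test.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Locale;

// Hypothetical helper: predictable, human-readable IDs instead of random UUIDs.
final class DeterministicIdGenerator {
    private final String prefix;
    private long next = 1;

    DeterministicIdGenerator(String prefix) {
        this.prefix = prefix;
    }

    String nextId() {
        return String.format(Locale.ROOT, "%s-%06d", prefix, next++);
    }
}

// Hypothetical helper: the rate, scale, and rounding mode are explicit test inputs.
final class FixedRateConverter {
    private final BigDecimal rate;
    private final int scale;
    private final RoundingMode rounding;

    FixedRateConverter(BigDecimal rate, int scale, RoundingMode rounding) {
        this.rate = rate;
        this.scale = scale;
        this.rounding = rounding;
    }

    BigDecimal convert(BigDecimal amount) {
        return amount.multiply(rate).setScale(scale, rounding);
    }
}
```

Making the rounding mode a constructor argument matters because the same amount can round differently under `HALF_UP` and `HALF_EVEN`; a test that leaves the policy implicit cannot tell which one the system applied.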

21.2 Business clock

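import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;

// Single source of business time; domain code asks this instead of the system clock.
public interface BusinessClock {
    LocalDate businessDate(); // business date as seen in the business zone
    Instant instant();        // exact point in time
    ZoneId zone();            // zone used to derive the business date
}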

In tests:

businessClock.freezeAt("2026-07-01T09:00:00+07:00");

This is especially important for:

  • period close;
  • effective-dated configuration;
  • delegation;
  • tax rate changes;
  • exchange rates;
  • SLA escalation;
  • report cutoffs.
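
A frozen test implementation might look like the sketch below. `FrozenBusinessClock` is a hypothetical test double: it assumes the business zone is passed in explicitly and that `freezeAt` takes an ISO-8601 timestamp with offset, matching the call style shown above. The interface is repeated so the example is self-contained.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.time.ZoneId;

interface BusinessClock {
    LocalDate businessDate();
    Instant instant();
    ZoneId zone();
}

// Hypothetical test double: time only moves when the test says so.
final class FrozenBusinessClock implements BusinessClock {
    private final ZoneId zone;
    private Instant frozen;

    FrozenBusinessClock(ZoneId businessZone, String isoOffsetDateTime) {
        this.zone = businessZone;
        freezeAt(isoOffsetDateTime);
    }

    // Accepts ISO-8601 with offset, for example "2026-07-01T09:00:00+07:00".
    void freezeAt(String isoOffsetDateTime) {
        this.frozen = OffsetDateTime.parse(isoOffsetDateTime).toInstant();
    }

    // The business date is derived in the business zone, not in UTC.
    @Override public LocalDate businessDate() { return frozen.atZone(zone).toLocalDate(); }
    @Override public Instant instant() { return frozen; }
    @Override public ZoneId zone() { return zone; }
}
```

Freezing the clock at `2026-06-30T17:30:00Z` with zone `Asia/Jakarta` yields a business date of 2026-07-01, which is the kind of cutoff case a period-close test should state explicitly.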

22. Contract Tests

Large ERP landscapes depend on contracts between modules and systems.

22.1 Contract targets

| Contract | Example |
| --- | --- |
| API contract | POST /vendor-invoices requires idempotency key and company scope. |
| Event contract | GoodsReceiptPosted contains receipt ID, company, item, quantity, source document. |
| File contract | Bank statement import file has mandatory transaction reference. |
| Report contract | Export schema for statutory tax report. |
| Extension contract | Plugin hook must return deterministic adjustment lines. |
| Read model contract | AR aging view exposes cutoff, checkpoint, and amount fields. |

22.2 Consumer-oriented question

A good contract test asks:

If provider changes this field, does a real consumer break?

Not:

Does provider still produce a JSON that matches its own current code?
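
The consumer-oriented question can be sketched as a check that starts from what a consumer actually reads. This is a simplified illustration, not the API of any contract-testing tool: `ConsumerContract`, the field names, and the payload-as-`Map` representation are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical consumer-side expectation: only the fields this consumer actually reads.
final class ConsumerContract {
    private final String consumer;
    private final Map<String, Class<?>> requiredFields;

    ConsumerContract(String consumer, Map<String, Class<?>> requiredFields) {
        this.consumer = consumer;
        this.requiredFields = requiredFields;
    }

    // Returns one message per expectation that the provider payload fails to meet.
    List<String> violations(Map<String, Object> providerPayload) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Class<?>> field : requiredFields.entrySet()) {
            Object value = providerPayload.get(field.getKey());
            if (value == null) {
                problems.add(consumer + " reads '" + field.getKey()
                        + "' but the provider omitted it");
            } else if (!field.getValue().isInstance(value)) {
                problems.add(consumer + " expects '" + field.getKey() + "' as "
                        + field.getValue().getSimpleName() + " but got "
                        + value.getClass().getSimpleName());
            }
        }
        return problems;
    }
}
```

Extra provider fields do not produce violations here, so the provider stays free to add data; only a rename, removal, or type change of a field a real consumer reads breaks the check.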


23. Mutation and Negative Testing

ERP tests must prove forbidden behavior is forbidden.

23.1 Negative test examples

| Area | Negative test |
| --- | --- |
| GL | unbalanced journal cannot post. |
| Inventory | issue cannot exceed available quantity when policy forbids negative stock. |
| Approval | requester cannot approve own request. |
| Period | invoice cannot post into closed period. |
| Payment | held invoice cannot be selected for payment. |
| Integration | duplicate event cannot create duplicate business effect. |
| Report | unauthorized user cannot export restricted report. |
| Migration | invalid account mapping rejects row before import. |

A test suite with only happy paths is not an ERP safety net.
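
Taking the GL case as an example, a negative test needs a guard that can actually refuse. The sketch below reduces posting to that single invariant; `JournalLine`, `JournalPoster`, and `UnbalancedJournalException` are hypothetical names, and persistence is deliberately left out.

```java
import java.math.BigDecimal;
import java.util.List;

record JournalLine(String account, BigDecimal debit, BigDecimal credit) {}

final class UnbalancedJournalException extends RuntimeException {
    UnbalancedJournalException(String message) {
        super(message);
    }
}

// Hypothetical posting guard, reduced to the single invariant under test.
final class JournalPoster {
    static void post(List<JournalLine> lines) {
        if (lines.isEmpty()) {
            throw new IllegalArgumentException("journal has no lines");
        }
        BigDecimal debits = BigDecimal.ZERO;
        BigDecimal credits = BigDecimal.ZERO;
        for (JournalLine line : lines) {
            debits = debits.add(line.debit());
            credits = credits.add(line.credit());
        }
        // compareTo, not equals: 100.0 and 100.00 are the same amount.
        if (debits.compareTo(credits) != 0) {
            throw new UnbalancedJournalException(
                    "debits " + debits + " != credits " + credits);
        }
        // A real implementation would persist the journal here.
    }
}
```

The negative test then asserts that a journal with debits of 100.00 and credits of 99.99 throws, and a companion positive test asserts that the balanced version does not, so the guard cannot pass by rejecting everything.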


24. Regression Pack Governance

ERP regression suites become slow and noisy unless governed.

24.1 Regression pack tiers

| Tier | Run cadence | Content |
| --- | --- | --- |
| Commit gate | every commit/PR | fast unit, invariant, service tests. |
| Integration gate | PR or merge queue | DB/broker integration, contract, critical scenario tests. |
| Nightly | nightly | full scenario pack, reports, migration dry run, concurrency tests. |
| Release candidate | before release | full regression, performance gates, security/control tests, cutover rehearsal. |
| Production smoke | post-deploy | synthetic non-destructive checks and monitoring assertions. |

24.2 Flaky test policy

A flaky ERP test is a production risk signal. Do not normalize flakiness.

| Flaky source | Response |
| --- | --- |
| uncontrolled time | inject/freeze clock. |
| async race | observe durable state, not sleep. |
| shared mutable fixture | isolate tenant/company/run ID. |
| environment drift | containerize dependency or pin version. |
| random data | seed generator and log seed. |
| order dependency | reset database/schema or isolate scenario. |
| real external dependency | replace with contract stub or sandbox with deterministic behavior. |

Never mark a critical invariant test as “ignore” without a replacement control.
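
For the async race row, "observe durable state, not sleep" can be sketched as a small polling helper that returns as soon as the expected state is visible and fails with the last observed state on timeout. `Await` is a hypothetical helper written for this example; libraries such as Awaitility provide a richer version of the same idea.

```java
import java.time.Duration;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical polling helper for tests; not a library API.
final class Await {
    static <T> T until(Supplier<T> probe, Predicate<T> done,
                       Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            T observed = probe.get(); // for example, read the document status from the database
            if (done.test(observed)) {
                return observed;
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new AssertionError(
                        "Timed out after " + timeout + "; last observed state: " + observed);
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new AssertionError("Interrupted while waiting", e);
            }
        }
    }
}
```

Unlike a fixed `Thread.sleep`, this finishes quickly when the system is fast, tolerates a slow run up to the timeout, and reports what the test actually saw when it gives up.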


25. Test Observability

Tests should generate useful diagnostics when they fail.

25.1 Diagnostic payload

For ERP scenario tests, failure output should include:

  • scenario ID;
  • tenant/company/branch;
  • business date;
  • config versions;
  • document IDs and numbers;
  • lifecycle state;
  • ledger lines;
  • stock movements;
  • workflow tasks;
  • emitted events;
  • audit timeline;
  • reconciliation differences;
  • relevant correlation IDs.

25.2 Failure message example

Poor:

Expected 100 but was 99

Better:

Inventory reconciliation failed
Scenario: P2P_STANDARD_RECEIPT
Company: ID01
Item: RAW-STEEL
Location: WH-JKT
Expected balance from stock ledger: 100
Projected balance: 99
Missing movement source: GRN-2026-000128
Last projection checkpoint: 2026-07-01T09:10:12+07:00
Correlation ID: test-P2P-009128

Good failure diagnostics shorten incident-style debugging during development.
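
A failure message of that shape can come from a small assertion helper that takes the business context along with the numbers. The sketch below is one possible form; `ReconciliationAssert` and its parameters are hypothetical, and the context keys are whatever the scenario chooses to pass in.

```java
import java.math.BigDecimal;
import java.util.Map;

// Hypothetical assertion helper that carries business context into the failure message.
final class ReconciliationAssert {
    static void assertBalancesMatch(Map<String, String> context,
                                    BigDecimal ledgerBalance,
                                    BigDecimal projectedBalance) {
        if (ledgerBalance.compareTo(projectedBalance) == 0) {
            return;
        }
        StringBuilder message = new StringBuilder("Inventory reconciliation failed\n");
        // Context entries are printed in the map's iteration order; pass a LinkedHashMap
        // if the order of lines matters.
        context.forEach((key, value) ->
                message.append(key).append(": ").append(value).append('\n'));
        message.append("Expected balance from stock ledger: ").append(ledgerBalance).append('\n');
        message.append("Projected balance: ").append(projectedBalance).append('\n');
        message.append("Difference: ").append(ledgerBalance.subtract(projectedBalance));
        throw new AssertionError(message.toString());
    }
}
```

The scenario supplies entries such as scenario ID, company, item, and correlation ID once, and every reconciliation failure then arrives with enough context to start debugging without rerunning the test.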


26. What Not to Test

Do not test everything with equal effort.

26.1 Low-value tests

| Test | Why it is weak |
| --- | --- |
| Getter/setter tests | No business confidence. |
| Mocked repository CRUD tests | Often assert mocks, not persistence behavior. |
| UI-only workflow tests | Slow and fragile; miss backend invariant details. |
| Snapshot-only JSON tests | Detect shape change but not business correctness. |
| Massive end-to-end duplication | Slow suite that teams stop trusting. |
| Tests depending on current date | Fail unpredictably. |
| Tests with production-like data copied blindly | Privacy risk and poor scenario clarity. |

This does not mean UI tests or snapshots are useless. It means they should not be the backbone of ERP quality.


27. Anti-Patterns

| Anti-pattern | Why it fails |
| --- | --- |
| Coverage theater | 90% coverage with no posting, reconciliation, or SoD confidence. |
| Mocked ERP reality | Mocking DB, broker, workflow, and time hides the bugs that matter. |
| One giant E2E suite | Slow, flaky, hard to diagnose, and expensive to maintain. |
| Fixture swamp | Nobody understands the test data, so failures become archaeology. |
| Happy-path worship | Negative and failure paths are where ERP correctness lives. |
| Report pixel testing | Formatting passes while totals are wrong. |
| No migration tests | Cutover becomes a heroic manual event. |
| No control tests | Approval/security defects discovered by auditors or production users. |
| No reconciliation assertions | Subledger and GL drift silently. |
| Flaky tests tolerated | Teams stop trusting the regression gate. |

28. Source Notes

This material is designed for ERP engineering practice, not as a wrapper around one framework.

Relevant baseline references:

  • JUnit User Guide: comprehensive reference for writing tests on the JUnit Platform, including Jupiter programming model and parameterized testing concepts.
  • Testcontainers for Java: Java testing library for running lightweight, throwaway dependencies such as databases, message brokers, and browsers in containers.
  • Jakarta Batch: enterprise Java batch processing model with chunk-oriented processing and checkpoint/restart semantics.
  • Jakarta Persistence: standard persistence model for Java enterprise applications.
  • Spring Boot testing and Actuator ecosystem: common baseline for modern Java enterprise services.

The ERP-specific strategy in this part adds the missing layer: business invariant, reconciliation, migration, control, and operational failure testing.


29. Kaufman 20-Hour Practice Plan

Hour 1-3: Build invariant catalogue

Pick one ERP slice, such as procure-to-pay.

Create a table of:

  • document lifecycle invariants;
  • financial posting invariants;
  • approval/SoD invariants;
  • stock/inventory invariants;
  • reporting/reconciliation invariants;
  • integration failure invariants.

Hour 4-6: Build golden dataset

Create minimal golden data:

  • one legal entity;
  • two branches;
  • one warehouse;
  • one vendor;
  • one item;
  • one chart of accounts;
  • one approval matrix;
  • one fiscal period;
  • one tax code.

Hour 7-9: Write domain invariant tests

Test:

  • balanced journal;
  • illegal lifecycle transition;
  • SoD violation;
  • tax rounding;
  • stock movement sign rules.

Hour 10-12: Write scenario tests

Implement:

  • standard P2P;
  • P2P with invoice mismatch;
  • goods receipt reversal;
  • payment duplicate prevention.

Hour 13-15: Write reconciliation tests

Test:

  • AP subledger to GL;
  • stock ledger to inventory balance;
  • AR aging to AR subledger.

Hour 16-18: Write failure tests

Test:

  • duplicate message;
  • retry after timeout;
  • period close race;
  • concurrent reservation;
  • idempotent batch restart.
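
As a starting point for the duplicate-message exercise, the sketch below shows the smallest handler that can be tested for idempotency. `GoodsReceiptHandler` is hypothetical and keeps its processed-message ledger in memory; a real system would store that ledger durably, typically in the same transaction as the business effect.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory handler used only to illustrate the test target.
final class GoodsReceiptHandler {
    private final Set<String> processedMessageIds = new HashSet<>();
    private int onHandQuantity;

    // Returns true when the message was applied, false when it was a duplicate.
    boolean handle(String messageId, int receivedQuantity) {
        if (!processedMessageIds.add(messageId)) {
            return false; // already processed: no second business effect
        }
        onHandQuantity += receivedQuantity;
        return true;
    }

    int onHandQuantity() {
        return onHandQuantity;
    }
}
```

The failure test delivers the same message ID twice and asserts on the business effect (on-hand quantity is 100, not 200), not only on the handler's return value.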

Hour 19-20: Review and refactor

Ask:

  • Which P0 invariant has no test?
  • Which test is slow but low-value?
  • Which scenario hides too much setup?
  • Which failure would be hard to diagnose?
  • Which report can produce wrong totals without failing a test?

30. Design Review Checklist

Use this checklist when reviewing ERP test strategy.

Invariants

  • Are P0 and P1 business invariants explicitly catalogued?
  • Are financial postings tested for balance and reconciliation?
  • Are stock movements tested through ledger and projection?
  • Are lifecycle transitions tested for legal and illegal paths?
  • Are period locks and effective dates tested?

Golden data

  • Is golden data deterministic and versioned?
  • Does it include organization, fiscal, security, master, config, and transactional slices?
  • Can it be rebuilt from source?
  • Is it understandable by business scenario name?

Scenarios

  • Do tests read like business scenarios?
  • Do scenario tests assert downstream effects, not only status?
  • Are negative cases represented?
  • Are audit events and evidence asserted for material controls?

Integration and failure

  • Are duplicate, delayed, out-of-order, and poison messages tested?
  • Are idempotency keys and processing ledgers tested?
  • Are retries and unknown outcomes tested?
  • Are reconciliation cases tested?

Migration and reports

  • Are migration imports idempotent?
  • Are opening balances reconciled?
  • Are reports tested against authoritative source totals?
  • Are report security filters tested?

Operations

  • Are batch restart and checkpoint behavior tested?
  • Are performance gates scenario-based?
  • Are concurrency races tested for high-risk boundaries?
  • Do test failures include diagnostic business context?

31. Summary

Large-scale ERP testing is a discipline of business-risk reduction.

The central ideas:

  • Do not optimize for coverage alone. Optimize for invariant confidence.
  • Build a golden dataset that represents a realistic miniature enterprise.
  • Express tests in business scenario language.
  • Assert ledger, stock, workflow, audit, report, and reconciliation effects.
  • Test negative paths and operational failures.
  • Use real infrastructure when infrastructure behavior is part of correctness.
  • Treat migration, reporting, and control tests as first-class product tests.
  • Govern the regression suite as an engineering asset.

A top ERP engineer does not ask, “Do we have tests?”

They ask:

Which business truth can still break without a test failing?

That question drives serious ERP quality engineering.
