Build CoreOrdered learning track

PL/pgSQL for Production Logic

Learn Production Grade Contract-First Java Orchestration Platform - Part 022

PL/pgSQL production-grade untuk regulatory case platform: function contract, state transition, audit append, outbox creation, lock discipline, exception mapping, security, migration safety, dan testing.

13 min read2483 words
PrevNext
Lesson 2240 lesson track0922 Build Core
#postgresql#plpgsql#stored-functions#database-contract+8 more

Part 022 — PL/pgSQL for Production Logic

PL/pgSQL is not a place to hide business logic because the Java code became messy.

PL/pgSQL is a tool for logic that must execute near the data, inside the same transaction, with the database as the final consistency boundary.

Used badly, PL/pgSQL becomes an invisible application server.

Used well, it becomes a small, sharp layer for:

  • enforcing state transitions under lock,
  • appending audit records atomically,
  • creating outbox events with the same commit as domain mutation,
  • translating constraint failures into stable SQLSTATE/domain errors,
  • implementing idempotent command persistence,
  • reducing race windows,
  • protecting data during migration and repair.

This part explains when to use PL/pgSQL, how to design function contracts, and how to integrate it with Java, MyBatis, Kafka, and Camunda 7 without turning the database into a black box.


1. The Boundary Rule

A production system should have a simple rule:

Java owns orchestration and intent.
PostgreSQL owns durable consistency.
PL/pgSQL owns small atomic operations that are safer near the data.

Do not use PL/pgSQL for:

  • HTTP request handling,
  • calling external services,
  • Kafka publishing,
  • Camunda API calls,
  • authorization policy that depends on external identity provider,
  • large application workflows,
  • business processes that need human explanation outside database context.

Use PL/pgSQL for:

  • atomic domain mutation,
  • state transition with expected version,
  • append-only audit write,
  • durable outbox insertion,
  • idempotency record creation/update,
  • sequence/reference generation,
  • data repair scripts,
  • migration helpers,
  • invariant-preserving bulk operations.

The function should be a transactional primitive, not a hidden feature module.


2. Why PL/pgSQL Exists in This Architecture

Our platform uses these boundaries:

The key property:

Case mutation, audit append, and outbox obligation commit together.

If Java updates the case and then separately inserts audit/outbox records, every boundary becomes a failure point:

case updated -> JVM crash before audit
case updated -> audit inserted -> crash before outbox
case updated -> outbox inserted -> duplicate audit on retry

PL/pgSQL can group the durable mutation as one database contract.

Kafka publishing still happens outside the transaction via the outbox publisher.


3. Function Contract Design

A PL/pgSQL function is a contract.

It needs the same discipline as an HTTP or Kafka contract.

A function contract must define:

Contract elementExample
Namecase_core.transition_case_status
Command intentmove case from one status to another
Required inputcase id, tenant id, expected version, target status, actor, reason
Authorization assumptioncaller has already authenticated and checked high-level permission
Invariant enforcedexpected status/version, valid transition, audit append, outbox event
Return modelnew version, previous status, new status, event id
Error modelstable SQLSTATE and message/detail/hint
Side effectsupdate case, insert audit, insert outbox
Idempotencyexternal or internal depending on command
Lockingrow-level lock on target case

Avoid ambiguous function names:

-- weak
perform_case_update(...)
process_case(...)
save_case(...)

Prefer intent-revealing names:

case_core.transition_case_status(...)
case_core.append_case_evidence(...)
case_core.record_case_decision(...)
integration.claim_outbox_batch(...)
integration.mark_outbox_published(...)

4. Function Input Style

There are two common styles.

4.1 Many Scalar Parameters

CREATE FUNCTION case_core.transition_case_status(
  p_tenant_id text,
  p_case_id uuid,
  p_expected_version integer,
  p_target_status text,
  p_actor_id text,
  p_reason text,
  p_correlation_id text
)
RETURNS TABLE (
  case_id uuid,
  previous_status text,
  new_status text,
  new_version integer,
  outbox_event_id uuid
)
LANGUAGE plpgsql
AS $$
BEGIN
  -- implementation
END;
$$;

Good for stable, small command contracts.

Pros:

  • explicit,
  • type-safe,
  • easy to map with MyBatis,
  • easy to document,
  • easy to detect breaking changes.

Cons:

  • signature changes when command evolves,
  • too many parameters can become noisy.

4.2 JSONB Command Parameter

CREATE FUNCTION case_core.transition_case_status(p_command jsonb)
RETURNS jsonb
LANGUAGE plpgsql
AS $$
BEGIN
  -- implementation
END;
$$;

Useful for flexible migration or internal admin tooling.

But it hides type checking and can make compatibility worse.

For production application commands, prefer scalar parameters or strongly typed composite types unless there is a clear reason.


5. Stable Error Contract

Do not throw random text from PL/pgSQL and parse it in Java.

Use stable SQLSTATE plus structured message/detail/hint.

Example:

RAISE EXCEPTION USING
  ERRCODE = 'P0001',
  MESSAGE = 'CASE_VERSION_CONFLICT',
  DETAIL = format('case_id=%s expected_version=%s', p_case_id, p_expected_version),
  HINT = 'Reload the case and retry the command with the latest version.';

PostgreSQL has standard SQLSTATE codes. For domain-specific errors, many teams use user-defined exception code patterns such as P0001 with stable messages.

In Java, map errors by SQLSTATE and message code:

public DomainError map(SQLException e) {
    String sqlState = e.getSQLState();
    String message = e.getMessage();

    if ("23505".equals(sqlState)) {
        return DomainError.conflict("UNIQUE_CONSTRAINT_VIOLATION");
    }
    if ("P0001".equals(sqlState) && message.contains("CASE_VERSION_CONFLICT")) {
        return DomainError.conflict("CASE_VERSION_CONFLICT");
    }
    if ("P0001".equals(sqlState) && message.contains("INVALID_CASE_TRANSITION")) {
        return DomainError.conflict("INVALID_CASE_TRANSITION");
    }
    return DomainError.internal("DATABASE_ERROR");
}

Better yet, standardize the function error message format so Java does not rely on fragile full-text parsing.


6. The Main Example: Transition Case Status

We will implement one production-grade function:

case_core.transition_case_status

It must:

  1. lock the case row,
  2. check tenant boundary,
  3. check expected version,
  4. check allowed transition,
  5. update case state,
  6. append audit log,
  7. insert outbox event,
  8. return mutation result.

6.1 Supporting Table: Transition Rule

Hardcoding every transition in PL/pgSQL can work for small systems, but a rule table is easier to inspect and migrate.

CREATE TABLE case_core.case_status_transition_rule (
  from_status text NOT NULL,
  to_status text NOT NULL,
  transition_code text NOT NULL,
  requires_reason boolean NOT NULL DEFAULT false,
  is_active boolean NOT NULL DEFAULT true,
  PRIMARY KEY (from_status, to_status)
);

INSERT INTO case_core.case_status_transition_rule
(from_status, to_status, transition_code, requires_reason)
VALUES
('DRAFT', 'INTAKE_VALIDATION', 'SUBMIT_FOR_INTAKE', false),
('INTAKE_VALIDATION', 'UNDER_ASSESSMENT', 'ACCEPT_INTAKE', false),
('INTAKE_VALIDATION', 'CLOSED', 'REJECT_INTAKE', true),
('UNDER_ASSESSMENT', 'UNDER_INVESTIGATION', 'OPEN_INVESTIGATION', false),
('UNDER_INVESTIGATION', 'PENDING_DECISION', 'COMPLETE_INVESTIGATION', false),
('PENDING_DECISION', 'DECIDED', 'RECORD_DECISION', false),
('DECIDED', 'CLOSED', 'CLOSE_CASE', false),
('CLOSED', 'REOPENED', 'REOPEN_CASE', true),
('REOPENED', 'UNDER_INVESTIGATION', 'RESUME_INVESTIGATION', false);

This table is not a replacement for BPMN.

It is the domain state transition guard. BPMN orchestration may trigger commands, but the database still protects the domain state.

6.2 Result Type

A named result type makes MyBatis mapping clearer.

CREATE TYPE case_core.case_transition_result AS (
  case_id uuid,
  previous_status text,
  new_status text,
  previous_version integer,
  new_version integer,
  audit_id uuid,
  outbox_event_id uuid
);

6.3 Function Implementation

CREATE OR REPLACE FUNCTION case_core.transition_case_status(
  p_tenant_id text,
  p_case_id uuid,
  p_expected_version integer,
  p_target_status text,
  p_actor_id text,
  p_reason text,
  p_correlation_id text
)
RETURNS case_core.case_transition_result
LANGUAGE plpgsql
AS $$
DECLARE
  v_case record;
  v_rule record;
  v_now timestamptz := clock_timestamp();
  v_audit_id uuid := gen_random_uuid();
  v_outbox_event_id uuid := gen_random_uuid();
  v_new_version integer;
  v_result case_core.case_transition_result;
BEGIN
  IF p_actor_id IS NULL OR btrim(p_actor_id) = '' THEN
    RAISE EXCEPTION USING
      ERRCODE = 'P0001',
      MESSAGE = 'ACTOR_REQUIRED',
      DETAIL = 'actor id is required for auditable case transition';
  END IF;

  SELECT
    c.case_id,
    c.tenant_id,
    c.case_reference,
    c.current_status,
    c.version
  INTO v_case
  FROM case_core.enforcement_case c
  WHERE c.case_id = p_case_id
    AND c.tenant_id = p_tenant_id
  FOR UPDATE;

  IF NOT FOUND THEN
    RAISE EXCEPTION USING
      ERRCODE = 'P0001',
      MESSAGE = 'CASE_NOT_FOUND',
      DETAIL = format('tenant_id=%s case_id=%s', p_tenant_id, p_case_id);
  END IF;

  IF v_case.version <> p_expected_version THEN
    RAISE EXCEPTION USING
      ERRCODE = 'P0001',
      MESSAGE = 'CASE_VERSION_CONFLICT',
      DETAIL = format(
        'case_id=%s expected_version=%s actual_version=%s',
        p_case_id,
        p_expected_version,
        v_case.version
      ),
      HINT = 'Reload the case and retry with the latest version.';
  END IF;

  SELECT r.*
  INTO v_rule
  FROM case_core.case_status_transition_rule r
  WHERE r.from_status = v_case.current_status
    AND r.to_status = p_target_status
    AND r.is_active = true;

  IF NOT FOUND THEN
    RAISE EXCEPTION USING
      ERRCODE = 'P0001',
      MESSAGE = 'INVALID_CASE_TRANSITION',
      DETAIL = format(
        'case_id=%s from_status=%s to_status=%s',
        p_case_id,
        v_case.current_status,
        p_target_status
      );
  END IF;

  IF v_rule.requires_reason AND (p_reason IS NULL OR btrim(p_reason) = '') THEN
    RAISE EXCEPTION USING
      ERRCODE = 'P0001',
      MESSAGE = 'TRANSITION_REASON_REQUIRED',
      DETAIL = format('transition_code=%s', v_rule.transition_code);
  END IF;

  v_new_version := v_case.version + 1;

  UPDATE case_core.enforcement_case
  SET current_status = p_target_status,
      version = v_new_version,
      updated_at = v_now
  WHERE case_id = p_case_id
    AND tenant_id = p_tenant_id;

  INSERT INTO case_core.case_audit_log (
    audit_id,
    case_id,
    tenant_id,
    audit_type,
    actor_id,
    occurred_at,
    correlation_id,
    details
  ) VALUES (
    v_audit_id,
    p_case_id,
    p_tenant_id,
    'CASE_STATUS_TRANSITION',
    p_actor_id,
    v_now,
    p_correlation_id,
    jsonb_build_object(
      'fromStatus', v_case.current_status,
      'toStatus', p_target_status,
      'transitionCode', v_rule.transition_code,
      'reason', p_reason,
      'previousVersion', v_case.version,
      'newVersion', v_new_version
    )
  );

  INSERT INTO integration.outbox_event (
    outbox_event_id,
    aggregate_type,
    aggregate_id,
    topic_name,
    partition_key,
    event_type,
    payload,
    headers,
    status,
    created_at,
    next_attempt_at
  ) VALUES (
    v_outbox_event_id,
    'CASE',
    p_case_id,
    'regulatory.case.events.v1',
    p_case_id::text,
    'CaseStatusChanged',
    jsonb_build_object(
      'eventId', v_outbox_event_id,
      'caseId', p_case_id,
      'tenantId', p_tenant_id,
      'caseReference', v_case.case_reference,
      'previousStatus', v_case.current_status,
      'newStatus', p_target_status,
      'version', v_new_version,
      'occurredAt', v_now
    ),
    jsonb_build_object(
      'correlationId', p_correlation_id,
      'actorId', p_actor_id,
      'schemaVersion', '1.0.0'
    ),
    'PENDING',
    v_now,
    v_now
  );

  v_result := (
    p_case_id,
    v_case.current_status,
    p_target_status,
    v_case.version,
    v_new_version,
    v_audit_id,
    v_outbox_event_id
  )::case_core.case_transition_result;

  RETURN v_result;
END;
$$;

The function intentionally does not publish to Kafka.

It creates a durable obligation to publish.

That is the outbox pattern.


7. Why the Function Locks the Row

The line below matters:

FOR UPDATE

Without it, two transactions can both read the same version/status and then race.

With row-level lock:

Transaction A locks case row.
Transaction B waits for same row.
Transaction A updates version/status and commits.
Transaction B resumes, sees updated row or fails version check.

This does not eliminate every concurrency issue in the system.

It makes this specific state transition atomic and serial for one case row.

For cross-case operations, different lock strategy may be needed.


8. Keep Authorization Outside, Keep Audit Context Inside

The function should not decide whether the user is allowed to transition the case unless all required authorization data is already in the database and stable.

Better split:

JAX-RS filter/auth layer -> identity validated
Java application service -> permission checked
PL/pgSQL function -> actor_id required and recorded

The database function should reject missing actor identity because the mutation must be auditable.

But it should not call an identity service.

IF p_actor_id IS NULL OR btrim(p_actor_id) = '' THEN
  RAISE EXCEPTION ... MESSAGE = 'ACTOR_REQUIRED';
END IF;

That is not authentication. It is audit integrity.


9. MyBatis Integration

A mapper can call the function explicitly.

<select id="transitionCaseStatus"
        parameterType="com.example.caseapp.persistence.TransitionCaseStatusParams"
        resultMap="CaseTransitionResultMap">
  SELECT *
  FROM case_core.transition_case_status(
    #{tenantId},
    #{caseId,jdbcType=OTHER},
    #{expectedVersion},
    #{targetStatus},
    #{actorId},
    #{reason},
    #{correlationId}
  )
</select>

Result map:

<resultMap id="CaseTransitionResultMap"
           type="com.example.caseapp.persistence.CaseTransitionResultRow">
  <id property="caseId" column="case_id" />
  <result property="previousStatus" column="previous_status" />
  <result property="newStatus" column="new_status" />
  <result property="previousVersion" column="previous_version" />
  <result property="newVersion" column="new_version" />
  <result property="auditId" column="audit_id" />
  <result property="outboxEventId" column="outbox_event_id" />
</resultMap>

Application service:

public TransitionCaseResult transition(TransitionCaseCommand command) {
    permissionService.requireCanTransition(command.actor(), command.caseId(), command.targetStatus());

    try {
        CaseTransitionResultRow row = mapper.transitionCaseStatus(
            TransitionCaseStatusParams.from(command)
        );
        return row.toDomainResult();
    } catch (PersistenceException e) {
        throw databaseErrorMapper.toApplicationException(e);
    }
}

The Java service still owns intent and permission.

The function owns atomic mutation.


10. Idempotent Command Function

Some commands must be idempotent at the API boundary.

Example:

POST /cases/{caseId}/transitions
Idempotency-Key: abc-123

The idempotency function pattern:

A simplified table:

CREATE TABLE integration.idempotency_record (
  tenant_id text NOT NULL,
  idempotency_key text NOT NULL,
  request_fingerprint text NOT NULL,
  status text NOT NULL,
  response_payload jsonb NULL,
  created_at timestamptz NOT NULL DEFAULT clock_timestamp(),
  completed_at timestamptz NULL,
  PRIMARY KEY (tenant_id, idempotency_key),
  CONSTRAINT ck_idempotency_status CHECK (status IN ('IN_PROGRESS', 'COMPLETED', 'FAILED'))
);

A helper function:

CREATE OR REPLACE FUNCTION integration.begin_idempotent_command(
  p_tenant_id text,
  p_idempotency_key text,
  p_request_fingerprint text
)
RETURNS text
LANGUAGE plpgsql
AS $$
DECLARE
  v_existing record;
BEGIN
  INSERT INTO integration.idempotency_record (
    tenant_id,
    idempotency_key,
    request_fingerprint,
    status
  ) VALUES (
    p_tenant_id,
    p_idempotency_key,
    p_request_fingerprint,
    'IN_PROGRESS'
  )
  ON CONFLICT (tenant_id, idempotency_key) DO NOTHING;

  IF FOUND THEN
    RETURN 'STARTED';
  END IF;

  SELECT * INTO v_existing
  FROM integration.idempotency_record
  WHERE tenant_id = p_tenant_id
    AND idempotency_key = p_idempotency_key
  FOR UPDATE;

  IF v_existing.request_fingerprint <> p_request_fingerprint THEN
    RAISE EXCEPTION USING
      ERRCODE = 'P0001',
      MESSAGE = 'IDEMPOTENCY_KEY_REUSED_WITH_DIFFERENT_REQUEST';
  END IF;

  IF v_existing.status = 'COMPLETED' THEN
    RETURN 'ALREADY_COMPLETED';
  END IF;

  RETURN 'IN_PROGRESS';
END;
$$;

This function alone is not the full design. You still need response storage, timeout policy, and recovery behavior.

But the invariant belongs in the database:

same tenant + same key -> same fingerprint or reject

11. Outbox Functions

The outbox publisher needs atomic claim and completion operations.

11.1 Claim Batch

CREATE OR REPLACE FUNCTION integration.claim_outbox_batch(
  p_batch_size integer,
  p_worker_id text
)
RETURNS SETOF integration.outbox_event
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  WITH candidate AS (
    SELECT outbox_event_id
    FROM integration.outbox_event
    WHERE status IN ('PENDING', 'FAILED')
      AND next_attempt_at <= clock_timestamp()
    ORDER BY next_attempt_at ASC, created_at ASC, outbox_event_id ASC
    LIMIT p_batch_size
    FOR UPDATE SKIP LOCKED
  )
  UPDATE integration.outbox_event o
  SET status = 'PUBLISHING',
      attempt_count = attempt_count + 1,
      locked_by = p_worker_id,
      locked_at = clock_timestamp()
  FROM candidate c
  WHERE o.outbox_event_id = c.outbox_event_id
  RETURNING o.*;
END;
$$;

The SKIP LOCKED pattern enables multiple workers to claim different rows without blocking each other on already-claimed rows.

11.2 Mark Published

CREATE OR REPLACE FUNCTION integration.mark_outbox_published(
  p_outbox_event_id uuid,
  p_worker_id text
)
RETURNS void
LANGUAGE plpgsql
AS $$
BEGIN
  UPDATE integration.outbox_event
  SET status = 'PUBLISHED',
      published_at = clock_timestamp(),
      locked_by = NULL,
      locked_at = NULL,
      last_error = NULL
  WHERE outbox_event_id = p_outbox_event_id
    AND status = 'PUBLISHING'
    AND locked_by = p_worker_id;

  IF NOT FOUND THEN
    RAISE EXCEPTION USING
      ERRCODE = 'P0001',
      MESSAGE = 'OUTBOX_EVENT_NOT_OWNED_BY_WORKER',
      DETAIL = format('outbox_event_id=%s worker_id=%s', p_outbox_event_id, p_worker_id);
  END IF;
END;
$$;

11.3 Mark Failed

CREATE OR REPLACE FUNCTION integration.mark_outbox_failed(
  p_outbox_event_id uuid,
  p_worker_id text,
  p_error text
)
RETURNS void
LANGUAGE plpgsql
AS $$
DECLARE
  v_attempt_count integer;
  v_next_attempt_at timestamptz;
BEGIN
  SELECT attempt_count
  INTO v_attempt_count
  FROM integration.outbox_event
  WHERE outbox_event_id = p_outbox_event_id
    AND status = 'PUBLISHING'
    AND locked_by = p_worker_id
  FOR UPDATE;

  IF NOT FOUND THEN
    RAISE EXCEPTION USING
      ERRCODE = 'P0001',
      MESSAGE = 'OUTBOX_EVENT_NOT_OWNED_BY_WORKER';
  END IF;

  v_next_attempt_at := clock_timestamp() + make_interval(secs => LEAST(300, power(2, v_attempt_count)::integer));

  UPDATE integration.outbox_event
  SET status = 'FAILED',
      next_attempt_at = v_next_attempt_at,
      locked_by = NULL,
      locked_at = NULL,
      last_error = left(p_error, 2000)
  WHERE outbox_event_id = p_outbox_event_id;
END;
$$;

Keep retry math simple and observable. More complex policies can live in Java if easier to test.


12. Trigger Functions: Use Sparingly

PL/pgSQL trigger functions can enforce automatic behavior, but they hide control flow.

Good trigger use cases:

  • maintain updated_at,
  • append low-level audit for sensitive tables,
  • prevent direct mutation of append-only tables,
  • enforce cross-column normalization.

Bad trigger use cases:

  • start business workflows,
  • call external integrations,
  • silently create domain events from every update,
  • replace application commands,
  • implement complex state machine invisibly.

Example updated_at trigger:

CREATE OR REPLACE FUNCTION case_core.set_updated_at()
RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
  NEW.updated_at := clock_timestamp();
  RETURN NEW;
END;
$$;

CREATE TRIGGER trg_case_set_updated_at
BEFORE UPDATE ON case_core.enforcement_case
FOR EACH ROW
EXECUTE FUNCTION case_core.set_updated_at();

This is acceptable because it is simple, local, and unsurprising.

A trigger that inserts Kafka outbox events on every update is more dangerous because it may publish events for internal repair, migration, or administrative corrections unless carefully filtered.


13. SECURITY DEFINER Risk

SECURITY DEFINER means the function runs with the privileges of the function owner, not the caller.

This can be useful for exposing a narrow safe mutation surface while denying direct table writes.

But it is risky.

If you use it:

  • set a safe search_path,
  • schema-qualify objects,
  • keep the function small,
  • revoke public execute if needed,
  • grant execute to specific application role,
  • avoid dynamic SQL where possible,
  • validate all inputs,
  • never concatenate untrusted strings into SQL.

Example skeleton:

CREATE OR REPLACE FUNCTION case_core.safe_transition_case_status(...)
RETURNS case_core.case_transition_result
LANGUAGE plpgsql
SECURITY DEFINER
SET search_path = case_core, integration, pg_temp
AS $$
BEGIN
  -- schema-qualified access still preferred for clarity
END;
$$;

REVOKE ALL ON FUNCTION case_core.safe_transition_case_status(...) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION case_core.safe_transition_case_status(...) TO app_case_service;

Do not use SECURITY DEFINER as a shortcut around proper privileges. Use it as a carefully audited interface.


14. Dynamic SQL Discipline

Dynamic SQL is powerful and dangerous.

Avoid it unless needed for:

  • generic migration helpers,
  • partition maintenance,
  • admin tooling,
  • generated report functions with controlled identifiers.

Bad:

EXECUTE 'SELECT * FROM ' || p_table_name || ' WHERE id = ''' || p_id || '''';

Better:

EXECUTE format('SELECT * FROM %I.%I WHERE id = $1', p_schema_name, p_table_name)
USING p_id;

Use format('%I', identifier) for identifiers and USING for values.

Still, most production command functions should not need dynamic SQL.


15. Migration-Safe Function Evolution

Function signatures are contracts.

Changing a function parameter list can break Java/MyBatis callers.

Safer strategies:

15.1 Add New Function Version

case_core.transition_case_status_v1(...)
case_core.transition_case_status_v2(...)

Use when behavior or result contract changes meaningfully.

15.2 Add Wrapper Function

CREATE OR REPLACE FUNCTION case_core.transition_case_status(
  p_tenant_id text,
  p_case_id uuid,
  p_expected_version integer,
  p_target_status text,
  p_actor_id text,
  p_reason text,
  p_correlation_id text
)
RETURNS case_core.case_transition_result
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN case_core.transition_case_status_v2(
    p_tenant_id,
    p_case_id,
    p_expected_version,
    p_target_status,
    p_actor_id,
    p_reason,
    p_correlation_id,
    NULL
  );
END;
$$;

Use when you need backward compatibility during rolling deploy.

15.3 Avoid Breaking Return Type Casually

If MyBatis maps columns by name, adding columns may be okay. Renaming/removing columns is breaking.

For command results, prefer explicit result columns with stable names.


16. Testing PL/pgSQL

Do not test database functions only through HTTP.

You need layered tests.

Test layerPurpose
SQL unit testdirect function behavior and SQLSTATE
Integration testJava/MyBatis mapping and transaction behavior
Concurrency testlock/version conflict behavior
Contract testresult columns and error codes remain stable
Migration testold and new app versions work during rollout
Recovery testoutbox/inbox/idempotency repair behavior

Example SQL test shape:

BEGIN;

SELECT case_core.transition_case_status(
  'tenant-a',
  '00000000-0000-0000-0000-000000000001'::uuid,
  1,
  'INTAKE_VALIDATION',
  'user-123',
  NULL,
  'corr-001'
);

SELECT current_status, version
FROM case_core.enforcement_case
WHERE case_id = '00000000-0000-0000-0000-000000000001'::uuid;

SELECT audit_type
FROM case_core.case_audit_log
WHERE case_id = '00000000-0000-0000-0000-000000000001'::uuid;

SELECT event_type, status
FROM integration.outbox_event
WHERE aggregate_id = '00000000-0000-0000-0000-000000000001'::uuid;

ROLLBACK;

In Java integration tests, use Testcontainers PostgreSQL and execute the same mapper call the application uses.


17. Observability for Database Functions

A function is production code. It needs observability.

At minimum, expose metrics at the Java layer:

  • function call count,
  • function latency,
  • SQLSTATE count,
  • domain error count,
  • lock timeout count,
  • deadlock count,
  • retry count,
  • outbox rows generated,
  • outbox age.

PostgreSQL-side logging can help during incidents, but do not spam logs from every function call.

Avoid this in hot paths:

RAISE NOTICE 'transitioning case %', p_case_id;

Use structured audit tables for domain audit, and application metrics for operational observability.


18. Performance Considerations

PL/pgSQL is usually fast enough for small transactional functions.

Performance problems appear when functions:

  • loop row by row over large sets,
  • execute unindexed queries,
  • hide expensive joins,
  • call functions per row in large queries,
  • use dynamic SQL unnecessarily,
  • do too much JSON manipulation in hot paths,
  • hold locks while doing slow work,
  • perform broad updates without batching.

Bad pattern:

FOR v_case IN SELECT * FROM case_core.enforcement_case WHERE current_status = 'DRAFT'
LOOP
  UPDATE case_core.enforcement_case SET ... WHERE case_id = v_case.case_id;
END LOOP;

Better set-based update if the rule is simple:

UPDATE case_core.enforcement_case
SET current_status = 'INTAKE_VALIDATION',
    version = version + 1,
    updated_at = clock_timestamp()
WHERE current_status = 'DRAFT'
  AND created_at < clock_timestamp() - interval '7 days';

For command functions, row-level logic is fine because one command usually targets one aggregate.

For bulk repair, use batch size and explicit progress tracking.


19. Transaction Boundary with Java

A PL/pgSQL function runs inside the transaction started by the caller.

If Java uses a transaction manager:

begin transaction
  call function
  maybe perform additional DB reads
commit

The function does not commit independently.

That is important.

Do not design function behavior assuming it commits immediately. The caller can still roll back.

This is usually good: API command either commits all database effects or none.

But it means:

  • do not publish Kafka inside the function,
  • do not rely on outbox rows being visible before commit,
  • do not hold transactions open while calling external systems,
  • keep Java transaction scope short.

20. Camunda 7 Integration Boundary

Camunda 7 process execution may call Java delegates. Those delegates may call database functions.

Be careful with transaction semantics.

A BPMN service task that calls transition_case_status may run inside a Camunda-managed transaction depending on engine integration.

The safe rule:

BPMN step calls application service.
Application service calls DB function.
DB function mutates domain + audit + outbox.
If delegate fails, transaction rolls back and Camunda retry/incident semantics apply.

Do not let Camunda process variables become the durable domain state.

After calling the function, store only what the process needs:

caseId
businessKey
newVersion
transitionResultCode

The durable truth remains in PostgreSQL.


21. Failure Model

Failure: Java Crashes After Function Returns Before HTTP Response

Database transaction may have committed or rolled back depending on crash timing.

Defense:

  • idempotency key,
  • durable audit,
  • durable outbox,
  • read-after-retry behavior.

Failure: Function Updates Case But Audit Insert Fails

If inside one function/transaction, entire transaction rolls back.

Defense:

  • atomic function,
  • not separate best-effort audit call.

Failure: Kafka Publish Fails

The function does not publish Kafka.

It inserts outbox row.

Defense:

  • outbox retry,
  • DLQ/quarantine policy,
  • outbox age alert.

Failure: Two Users Transition Same Case

Defense:

  • row lock,
  • expected version,
  • stable conflict error.

Failure: Function Signature Breaks Rolling Deploy

Defense:

  • function versioning,
  • wrapper functions,
  • expand-contract release.

Failure: SECURITY DEFINER Escalation

Defense:

  • safe search path,
  • least privilege,
  • no unsafe dynamic SQL,
  • explicit grants,
  • audit review.

22. Production Checklist

Before creating or changing a PL/pgSQL function:

  • Is the function a small atomic primitive, not a hidden application workflow?
  • Is the input contract explicit?
  • Is the return contract stable?
  • Are all touched tables schema-qualified?
  • Is the lock strategy documented?
  • Are expected version/status checks included where needed?
  • Are audit and outbox writes atomic with mutation?
  • Are domain errors stable and mappable in Java?
  • Are SQLSTATE/constraint violations tested?
  • Does the function avoid external calls?
  • Does the function avoid broad unbounded scans?
  • Is the function safe under rolling deploy?
  • If SECURITY DEFINER is used, is search_path controlled?
  • Is the function covered by SQL and Java integration tests?
  • Are metrics available at the Java boundary?

23. Anti-Patterns

Anti-patternWhy it fails
Move all business logic into PL/pgSQLCreates hidden application server with weak tooling and ownership
Let Java parse free-form database error textFragile and breaks localization/message changes
Publish Kafka from database functionExternal side effect cannot commit atomically with DB transaction
Use triggers for major business workflowsMakes behavior invisible and hard to reason about
Use SECURITY DEFINER casuallyCan create privilege escalation vulnerabilities
Use JSONB command for every functionLoses database type checking and contract clarity
Change function signature during rolling deployBreaks old application pods
Hold locks while doing slow workCauses contention and incident cascades
Catch all exceptions and return successCorrupts operational truth
Create audit as best effort after mutationDestroys regulatory defensibility

24. What You Should Internalize

PL/pgSQL is most valuable when it makes a dangerous race impossible.

It should not replace Java application architecture.

It should compress a critical consistency window into one durable database operation:

check -> lock -> mutate -> audit -> outbox -> return

That pattern is the heart of production-grade regulatory persistence.

The database function should be boring, explicit, tested, versioned, observable, and hard to misuse.

If a function feels clever, it is probably too clever.


25. References

Lesson Recap

You just completed lesson 22 in build core. Use the series map if you want to review the broader track, or continue directly into the next lesson while the context is still warm.

Continue The Track

Keep the momentum while the lesson is still fresh. Move backward for review or continue forward into the next concept.