
CI/CD for Independent Deployability

Learn Java Microservices Design and Architect - Part 073

CI/CD for independent deployability in Java microservices: pipeline as product, artifact immutability, contract verification, deployment gates, promotion, rollback, roll-forward, and production verification.


Part 073 — CI/CD for Independent Deployability

1. Core Idea

A microservice is not truly independent just because it has its own repository.

It is independent when the owning team can safely change, verify, deploy, observe, and recover the service without forcing unrelated teams into the same release event.

That requires more than a CI pipeline.

It requires a delivery system that treats every service as a product with a runtime contract.

A production-grade delivery path answers these questions:

  • Can this service be built reproducibly?
  • Can this artifact be traced back to source, commit, dependency set, and pipeline run?
  • Can this change be verified without deploying the whole enterprise?
  • Can downstream consumers remain compatible?
  • Can upstream providers remain compatible?
  • Can this artifact be promoted without rebuilding?
  • Can the deployment be stopped if runtime signals become unhealthy?
  • Can we roll forward quickly if rollback is unsafe?
  • Can we explain what changed during an incident?
  • Can we prove that required gates were executed?

For top-tier microservices engineering, CI/CD is not a DevOps checkbox.

It is the mechanism that makes architectural promises executable.

Independent deployability is an architectural property, but CI/CD is where that property either becomes real or collapses into ceremony.

2. The Mental Model: Deployment Unit vs Release Unit vs Exposure Unit

Most teams use the word “release” too loosely.

That creates confusion.

Separate these three concepts.

| Concept | Meaning | Example |
|---|---|---|
| Deployment unit | What is installed into runtime | Container image case-service:1.42.0 deployed to Kubernetes |
| Release unit | What business capability is considered available | “Risk scoring explanation is available to investigators” |
| Exposure unit | Who can see/use the behavior | Internal users, beta tenants, 5% of traffic, one region |

A mature microservices pipeline separates them.

You may deploy code today, release the feature tomorrow, and expose it gradually over a week.

That separation is essential because microservices require compatibility windows. If deployment and release are fused, every deploy becomes a coordination event.

The artifact should be immutable.

The behavior may be conditional.

The exposure may be gradual.

This gives teams operational control without rebuilding or redeploying for every business toggle.
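The separation can be sketched in a few lines of Java. This is a minimal illustration, not a real flag library: `ExposureSketch`, `ExposureRule`, and the hash-based bucketing are hypothetical names and choices. The deployed artifact stays the same; only the rule changes.

```java
import java.util.Objects;

/** Hypothetical sketch: one deployed artifact, release toggled by a flag, exposure by percentage. */
final class ExposureSketch {

    /** released = the capability is switched on; exposurePercent = share of tenants that see it. */
    record ExposureRule(boolean released, int exposurePercent) {
        ExposureRule {
            if (exposurePercent < 0 || exposurePercent > 100) {
                throw new IllegalArgumentException("exposurePercent must be 0..100");
            }
        }
    }

    /** Deterministic bucket in [0, 100) so a tenant stays in or out between requests. */
    static int bucketOf(String tenantId) {
        return Math.floorMod(Objects.requireNonNull(tenantId).hashCode(), 100);
    }

    /** A tenant sees the behavior only if it is released and the tenant falls inside the exposed share. */
    static boolean isExposed(ExposureRule rule, String tenantId) {
        return rule.released() && bucketOf(tenantId) < rule.exposurePercent();
    }
}
```

Raising `exposurePercent` from 5 to 100 over a week is a configuration change. No rebuild and no redeploy is involved.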

3. Independent Deployability Is Not “Deploy Whenever You Want”

A dangerous interpretation of microservices is:

Every team can deploy anything at any time.

That is not autonomy.

That is unmanaged risk.

Independent deployability means:

A team can deploy independently because compatibility, safety gates, observability, rollback/roll-forward, and ownership are built into the delivery system.

Independence requires discipline.

A team earns independent deployability by satisfying constraints:

  • stable public contract
  • backward-compatible changes by default
  • automated verification
  • owner-approved risk exceptions
  • production readiness
  • operational telemetry
  • safe deployment strategy
  • incident response path
  • data migration discipline
  • consumer communication policy

Without these constraints, independent deployability becomes distributed chaos.

4. CI/CD as Architecture Enforcement

A pipeline is not only an automation script.

It is an executable architecture review.

Every architecture rule that can be checked should be checked before production.

Examples:

| Architecture rule | Pipeline check |
|---|---|
| Domain layer must not depend on Spring Web | ArchUnit/static architecture test |
| API change must be backward compatible | OpenAPI diff / contract test |
| Event schema must remain compatible | Schema compatibility check |
| Service must expose health endpoint | Component test / smoke test |
| Service must emit trace IDs | Observability smoke test |
| Container must not run as root | Image policy check |
| Secrets must not be committed | Secret scanning |
| Critical dependency must have timeout | Static config check / integration test |
| New service must have owner metadata | Service catalog validation |
| Release must have rollback/roll-forward plan | Deployment manifest/ADR check |

The more microservices you have, the more you need automation to preserve consistency.

Manual architecture review does not keep pace as the service count grows.

Executable checks do.

5. The Delivery Pipeline Should Be Service-Centric

A Java microservice pipeline should be designed around the service lifecycle.

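A typical flow runs, in roughly this order: compile and dependency resolution, unit tests, static analysis and security scanning, architecture tests, component tests, contract tests, integration tests, packaging, evidence generation, manifest validation, and environment promotion.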

The exact tool does not matter as much as the invariants.

The pipeline must:

  • build once
  • test before deploy
  • package immutably
  • promote the same artifact
  • verify contracts
  • verify deployment health
  • preserve traceability
  • produce evidence
  • stop unsafe changes
  • expose runtime feedback

6. Build Once, Promote the Same Artifact

A common anti-pattern:

build dev artifact
build staging artifact
build production artifact

This makes environments incomparable.

If staging passed but production uses a separately built artifact, you did not verify the production artifact.

A stronger model:

source commit -> immutable artifact -> promote artifact across environments

Only environment-specific configuration changes.

The artifact does not.

This matters because incident diagnosis depends on artifact identity.

When production fails, you should be able to say:

  • exact commit
  • exact image digest
  • exact dependency set
  • exact config version
  • exact migration version
  • exact contract version
  • exact deployment timestamp
  • exact pipeline run

If you cannot answer those quickly, your CI/CD system is not production-grade.
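A promotion gate for this rule can be very small. The sketch below is an assumption-laden illustration: `PromotionSketch`, `ArtifactIdentity`, and `canPromote` are hypothetical names, and a real pipeline would read these values from its registry and run metadata.

```java
import java.util.Objects;

/** Hypothetical sketch of a "promote the same artifact" gate. */
final class PromotionSketch {

    /** The minimum identity a pipeline should be able to state for any running artifact. */
    record ArtifactIdentity(String commit, String imageDigest, String pipelineRun) {
        ArtifactIdentity {
            Objects.requireNonNull(commit, "commit");
            Objects.requireNonNull(pipelineRun, "pipelineRun");
            // A mutable tag such as "latest" is not an identity; require a content digest.
            if (imageDigest == null || !imageDigest.startsWith("sha256:")) {
                throw new IllegalArgumentException("promote by digest, not by mutable tag");
            }
        }
    }

    /** Promotion is allowed only when the candidate is the very artifact that was verified. */
    static boolean canPromote(ArtifactIdentity verifiedInStaging, ArtifactIdentity candidate) {
        return verifiedInStaging.equals(candidate);
    }
}
```

A rebuilt image gets a different digest even from the same commit, so the gate rejects it. That is the point.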

7. The Java Microservice Build Contract

For a Java microservice, the build contract should be explicit.

A service build should produce:

  • compiled classes
  • test report
  • static analysis report
  • dependency vulnerability report
  • architecture test report
  • contract verification report
  • packaged application artifact
  • container image
  • image digest
  • SBOM
  • provenance/build metadata
  • deployment manifest or release bundle

A service build should not depend on:

  • developer machine state
  • local Maven cache correctness
  • hidden environment variables
  • mutable snapshot dependency in production
  • unpinned base images
  • manually installed tools
  • ad-hoc secrets

A build that cannot be reproduced cannot be trusted.

8. Repository Structure for Delivery

A service repository should make delivery intent visible.

Example:

case-service/
  pom.xml
  src/
    main/java/...
    test/java/...
  contracts/
    openapi/
    pact/
    events/
  deploy/
    helm/
    kustomize/
  ops/
    dashboards/
    alerts/
    runbooks/
  docs/
    adr/
    service-charter.md
    production-readiness.md
  .github/workflows/
    ci.yml
    release.yml

The exact layout varies, but a production-grade repository should expose:

  • code
  • contracts
  • delivery manifests
  • operational assets
  • decisions
  • ownership metadata

If a service repository contains only source code, delivery knowledge is probably scattered elsewhere.

That scattering increases incident recovery time.

9. Pipeline Stage 1 — Compile and Dependency Resolution

Compilation is the first contract check.

It verifies internal consistency of the codebase.

For Java services:

  • use deterministic Java version
  • pin Maven/Gradle wrapper version
  • avoid mutable dependency versions for production builds
  • fail on dependency convergence issues
  • enforce dependency scopes
  • run annotation processors consistently
  • validate generated code is reproducible

Example Maven command:

./mvnw -B -V -ntp clean verify

Production-grade build logic should avoid hidden behavior.

Bad:

<version>LATEST</version>

Better:

<version>2.17.3</version>

Bad:

mvn install

Better:

./mvnw -B -V -ntp clean verify

The wrapper matters because the pipeline should not depend on a globally installed build tool version.

10. Pipeline Stage 2 — Unit Tests

Unit tests verify local behavior without network, database, queue, or container runtime.

For microservices, unit tests should cover:

  • domain invariants
  • value object validation
  • command handler branching
  • policy decisions
  • mapper edge cases
  • retry classifier
  • idempotency key generation
  • state transition rules

Example domain test:

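import static org.junit.jupiter.api.Assertions.assertThrows;

import org.junit.jupiter.api.Test;

class CaseAggregateTest {

    @Test
    void cannotEscalateClosedCase() {
        // Arrange: a case that has already been closed
        CaseFile caseFile = CaseFile.closed(CaseId.of("CASE-1001"));

        // Act and assert: escalation must be rejected by the domain invariant
        assertThrows(InvalidCaseTransition.class, () ->
            caseFile.escalate(EscalationReason.REGULATORY_DEADLINE)
        );
    }
}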

Unit tests should be fast enough to run on every commit.

If unit tests require a live database, they are not unit tests.

That does not mean database tests are bad.

It means they belong to another stage with another cost model.
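For context, here is one possible shape of the domain code such a test exercises. It is a sketch only: the real `CaseFile` is not shown in this lesson, so every type below is a hypothetical stand-in, nested inside `CaseFileSketch` to keep the example in one file.

```java
/** Hypothetical domain sketch: a state transition rule with no framework dependency. */
final class CaseFileSketch {

    enum Status { OPEN, ESCALATED, CLOSED }

    enum EscalationReason { REGULATORY_DEADLINE }

    record CaseId(String value) {
        static CaseId of(String value) {
            if (value == null || value.isBlank()) {
                throw new IllegalArgumentException("case id is required");
            }
            return new CaseId(value);
        }
    }

    static final class InvalidCaseTransition extends RuntimeException {
        InvalidCaseTransition(String message) {
            super(message);
        }
    }

    static final class CaseFile {
        private final CaseId id;
        private Status status;

        private CaseFile(CaseId id, Status status) {
            this.id = id;
            this.status = status;
        }

        static CaseFile open(CaseId id) {
            return new CaseFile(id, Status.OPEN);
        }

        static CaseFile closed(CaseId id) {
            return new CaseFile(id, Status.CLOSED);
        }

        /** Invariant: a closed case can never be escalated. */
        void escalate(EscalationReason reason) {
            if (status == Status.CLOSED) {
                throw new InvalidCaseTransition(
                    "cannot escalate closed case " + id.value() + " for " + reason);
            }
            status = Status.ESCALATED;
        }

        Status status() {
            return status;
        }
    }
}
```

Nothing here touches a network, a database, or Spring. That is why this kind of rule can be verified on every commit.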

11. Pipeline Stage 3 — Static Analysis and Security Scanning

Static analysis catches defects before runtime.

Useful gates:

  • compiler warnings as errors where practical
  • formatting/linting
  • forbidden dependency checks
  • dependency vulnerability scanning
  • secret scanning
  • license policy checks
  • code quality threshold
  • nullness/error-prone checks when adopted

A mature pipeline distinguishes between:

  • hard blockers
  • warnings
  • risk exceptions
  • time-bounded waivers

Not every finding should block every deploy.

But every exception should be explicit.

Example waiver metadata:

waiver:
  id: CVE-2026-12345
  service: case-service
  reason: vulnerable code path not reachable; upgrade blocked by provider SDK
  owner: case-platform-team
  expires: 2026-08-15
  approvedBy: security-architecture

Waivers without expiration become normalized risk.
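Expiry is easy to enforce in the gate itself. The following is a minimal sketch under stated assumptions: `WaiverSketch`, `Waiver`, and `evaluate` are hypothetical names, and the waiver list would normally be loaded from reviewed metadata like the YAML above.

```java
import java.time.LocalDate;
import java.util.List;

/** Hypothetical sketch: a finding blocks the pipeline unless an unexpired waiver covers it. */
final class WaiverSketch {

    record Waiver(String findingId, String service, String owner, LocalDate expires) {}

    enum Decision { BLOCK, ALLOW_WITH_WAIVER }

    static Decision evaluate(String findingId, String service, List<Waiver> waivers, LocalDate today) {
        boolean covered = waivers.stream().anyMatch(w ->
            w.findingId().equals(findingId)
                && w.service().equals(service)
                // The waiver is valid through its expiry date and not one day longer.
                && !today.isAfter(w.expires()));
        return covered ? Decision.ALLOW_WITH_WAIVER : Decision.BLOCK;
    }
}
```

Passing `today` as a parameter keeps the rule testable. Once the date passes, the same finding blocks again without anyone editing the pipeline.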

12. Pipeline Stage 4 — Architecture Tests

Architecture tests verify internal dependency rules.

Example rules:

  • API layer may depend on application layer
  • application layer may depend on domain ports
  • domain layer must not depend on infrastructure
  • infrastructure may implement domain ports
  • controllers must not call repositories directly
  • adapters must not leak external DTOs into domain
  • generated API classes must not become domain model

Example using ArchUnit style:

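import static com.tngtech.archunit.lang.syntax.ArchRuleDefinition.noClasses;

import com.tngtech.archunit.junit.AnalyzeClasses;
import com.tngtech.archunit.junit.ArchTest;
import com.tngtech.archunit.lang.ArchRule;

@AnalyzeClasses(packages = "com.example.casefile")
class ArchitectureRulesTest {

    // The domain model stays free of framework and persistence annotations.
    @ArchTest
    static final ArchRule domain_should_not_depend_on_spring =
        noClasses()
            .that().resideInAPackage("..domain..")
            .should().dependOnClassesThat()
            .resideInAnyPackage("org.springframework..", "jakarta.persistence..");

    // API classes must go through the application layer, never straight to repositories.
    @ArchTest
    static final ArchRule controllers_should_not_access_repositories =
        noClasses()
            .that().resideInAPackage("..api..")
            .should().dependOnClassesThat()
            .resideInAPackage("..repository..");
}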

Architecture tests are not academic.

They keep service internals from collapsing into accidental coupling.

13. Pipeline Stage 5 — Component Tests

A component test verifies a service from its public boundary while replacing external dependencies with controlled doubles.

For a Java HTTP service:

  • start application context
  • call HTTP endpoint
  • use test database or containerized database
  • stub external HTTP services
  • verify response
  • verify DB state
  • verify outbox event

Example:

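// Imports assume Spring Boot 3.x, JUnit 5, AssertJ, and Testcontainers on the test classpath.
import static org.assertj.core.api.Assertions.assertThat;
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.test.web.client.TestRestTemplate;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.testcontainers.junit.jupiter.Testcontainers;

// Assumption: the database container and external HTTP stubs are declared in shared test configuration.
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
@Testcontainers
class RegisterAllegationComponentTest {

    @Autowired TestRestTemplate http;
    @Autowired OutboxRepository outbox;

    @Test
    void registersAllegationAndPublishesIntegrationEvent() {
        var request = new RegisterAllegationRequest(
            "CASE-1001",
            "MISCONDUCT",
            "Evidence summary"
        );

        // Exercise the service through its public HTTP boundary.
        ResponseEntity<ProblemOrAllegationResponse> response =
            http.postForEntity("/cases/CASE-1001/allegations", request, ProblemOrAllegationResponse.class);

        // Verify both the response and the side effect recorded in the outbox.
        assertEquals(HttpStatus.CREATED, response.getStatusCode());
        assertThat(outbox.findByAggregateId("CASE-1001"))
            .anyMatch(event -> event.type().equals("AllegationRegistered"));
    }
}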

Component tests are valuable because they verify the service as a unit of deployment.

They are slower than unit tests but much cheaper than full end-to-end tests.

14. Pipeline Stage 6 — Contract Tests

Microservices fail at integration boundaries.

Contract tests protect those boundaries.

You need multiple contract types:

| Contract type | Protects |
|---|---|
| HTTP API contract | Request/response compatibility |
| Event contract | Published event compatibility |
| gRPC/protobuf contract | RPC method/message compatibility |
| Consumer-driven contract | Actual consumer expectations |
| Provider contract verification | Provider does not break consumers |

Consumer-driven contract testing is especially useful when many consumers depend on one provider.

The consumer defines the subset of provider behavior it uses.

The provider verifies those expectations before release.

The important rule:

A provider cannot safely deploy a breaking change until affected consumers are compatible.

Contract testing turns that rule into an automated gate.
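The core of provider verification can be sketched without any tooling. This is a deliberately simplified illustration, not how Pact or Spring Cloud Contract work internally: `ContractCheckSketch` is a hypothetical name, and contracts are reduced to field-name-to-type maps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: does the provider still offer everything one consumer relies on? */
final class ContractCheckSketch {

    /**
     * @param consumerExpects fields the consumer actually reads, mapped to their expected type
     * @param providerOffers  fields the provider's candidate version returns, mapped to their type
     * @return human-readable violations, empty when the provider is compatible for this consumer
     */
    static List<String> violations(Map<String, String> consumerExpects, Map<String, String> providerOffers) {
        List<String> problems = new ArrayList<>();
        // TreeMap gives a stable, sorted report regardless of the input map's ordering.
        new TreeMap<>(consumerExpects).forEach((field, expectedType) -> {
            String offeredType = providerOffers.get(field);
            if (offeredType == null) {
                problems.add("missing field: " + field);
            } else if (!offeredType.equals(expectedType)) {
                problems.add("type changed: " + field + " " + expectedType + " -> " + offeredType);
            }
        });
        return problems;
    }
}
```

Extra provider fields are ignored on purpose. The consumer defines only the subset it uses, so the provider stays free to add.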

15. Pipeline Stage 7 — Integration Tests

Integration tests verify behavior against real dependencies or near-real test environments.

They are expensive.

Use them carefully.

Good integration tests verify:

  • database migration correctness
  • outbox publisher behavior
  • message broker integration
  • service mesh/gateway behavior
  • external provider adapter behavior
  • authentication/authorization integration
  • observability wiring

Bad integration tests attempt to cover every business branch end-to-end.

That creates slow, flaky pipelines.

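A better test pyramid for microservices has many unit tests at the base, component and contract tests in the middle, a small number of integration tests above them, and only a few end-to-end tests at the top.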

End-to-end tests should validate critical journeys, not every condition.

16. Pipeline Stage 8 — Packaging and Container Image

The container image is the runtime artifact.

It should be:

  • small enough to pull quickly
  • secure enough for baseline policy
  • deterministic enough for traceability
  • observable enough for operations
  • compatible with container resource limits
  • explicit about exposed ports
  • non-root where possible
  • free of build-time secrets

Example Dockerfile:

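# Runtime-only base image; pin it to a digest or a platform golden image in real builds
FROM eclipse-temurin:21-jre

WORKDIR /app
# Copy the jar produced earlier in the pipeline; nothing is compiled inside the image
COPY target/case-service.jar /app/case-service.jar

# Numeric non-root user so the "must not run as root" policy can be checked
USER 10001:10001
EXPOSE 8080

ENTRYPOINT ["java", "-jar", "/app/case-service.jar"]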

In many enterprise platforms, teams should not hand-roll every Dockerfile.

The platform should provide a golden base image or buildpack path.

The application team still owns application behavior.

The platform team owns hardened runtime baseline.

17. Pipeline Stage 9 — SBOM, Provenance, and Artifact Evidence

As systems scale, “what is running?” becomes a security and incident response question.

A mature pipeline should produce evidence:

  • software bill of materials
  • dependency list
  • source commit
  • build timestamp
  • builder identity
  • image digest
  • vulnerability scan result
  • test result
  • signer/provenance metadata
  • deployment approval if required

This evidence helps answer:

  • Are we affected by a new CVE?
  • Which services run library X?
  • Which artifact entered production after incident start?
  • Which pipeline built this image?
  • Was a required scan bypassed?
  • Which team owns the risk?

Without evidence, incident response becomes archaeology.

18. Pipeline Stage 10 — Deployment Manifest Validation

Before deployment, validate the runtime contract.

For Kubernetes workloads, check:

  • resource requests/limits
  • readiness/liveness/startup probes
  • graceful termination settings
  • security context
  • network policy expectations
  • service account
  • config references
  • secret references
  • autoscaling policy
  • pod disruption budget when required
  • topology spread constraints when required
  • observability annotations
  • sidecar configuration if mesh is used

Example policy intent:

rules:
  - name: require-readiness-probe
    appliesTo: Deployment
    severity: block
  - name: forbid-root-container
    appliesTo: PodSpec
    severity: block
  - name: require-service-owner-label
    appliesTo: all
    severity: block
  - name: require-resource-requests
    appliesTo: Container
    severity: block

Delivery pipelines should fail before unsafe manifests reach the cluster.

19. Pipeline Stage 11 — Environment Promotion

A production-grade service usually passes through environments.

Example:

local -> ephemeral preview -> integration -> staging -> production

But beware environment theater.

More environments do not automatically mean more safety.

Each environment must have a clear purpose.

| Environment | Purpose |
|---|---|
| Local | developer feedback |
| Preview | PR-level integration check |
| Integration | service collaboration validation |
| Staging | production-like deployment rehearsal |
| Production | real user/runtime verification |

The same artifact should move through these environments.

The difference should be configuration, data, scale, and exposure.

20. Ephemeral Environments

Ephemeral environments are created per branch, pull request, or feature.

They are useful when:

  • service has complex UI/API integration
  • consumer/provider changes need early validation
  • database migration needs rehearsal
  • gateway routing needs verification
  • multiple teams collaborate on a temporary change

But ephemeral environments can become expensive and unreliable.

Use them for high-value integration checks, not as a replacement for modular tests.

Minimum ephemeral environment contract:

ephemeralEnvironment:
  ttl: 72h
  owner: case-platform-team
  sourceRef: pull-request-1842
  dataProfile: synthetic-minimal
  externalDependencies: stubbed
  destroyPolicy: automatic

Without TTL and ownership, ephemeral environments become zombie infrastructure.

21. Pull Request Gate vs Mainline Gate

Not every check belongs in every PR.

PR checks should be fast enough to keep developer flow.

Mainline checks can be deeper.

Release checks can be even deeper.

Example:

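| Check | PR | Main | Release |
|---|---|---|---|
| Compile | yes | yes | yes |
| Unit tests | yes | yes | yes |
| Architecture tests | yes | yes | yes |
| Component tests | yes | yes | yes |
| Contract generation | yes | yes | yes |
| Provider contract verification | maybe | yes | yes |
| Image build | maybe | yes | yes |
| Full vulnerability scan | maybe | yes | yes |
| Integration environment deploy | optional | yes | yes |
| Staging deployment | no | optional | yes |
| Production canary | no | no | yes |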

The objective is not to maximize checks everywhere.

The objective is to maximize risk reduction per unit of feedback time.

22. CI Pipeline Example

Example GitHub Actions-like flow:

name: case-service-ci

on:
  pull_request:
  push:
    branches: [main]

jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: '21'
          cache: maven

      - name: Compile and test
        run: ./mvnw -B -ntp clean verify

      - name: Architecture tests
        run: ./mvnw -B -ntp -Parchitecture-test test

      - name: Contract tests
        run: ./mvnw -B -ntp -Pcontract-test verify

      - name: Build image
        if: github.ref == 'refs/heads/main'
        run: |
          docker build -t registry.example.com/case-service:${GITHUB_SHA} .
          docker push registry.example.com/case-service:${GITHUB_SHA}

This is only illustrative.

Real pipelines need secrets handling, signing, scanning, provenance, deployment, and promotion.

23. CD Pipeline Example

Example deployment flow:

release:
  artifact:
    image: registry.example.com/case-service@sha256:abc123
    commit: 9f7c2e4
    service: case-service

  gates:
    - contract-verification
    - image-vulnerability-policy
    - manifest-policy
    - migration-dry-run
    - production-readiness-score

  environments:
    - name: staging
      strategy: rolling
      verify:
        - smoke-tests
        - synthetic-journey
        - telemetry-presence

    - name: production
      strategy: canary
      canary:
        steps:
          - weight: 5
            duration: 10m
          - weight: 25
            duration: 20m
          - weight: 50
            duration: 30m
          - weight: 100
        analysis:
          - availability-slo
          - p95-latency
          - error-rate
          - dependency-timeout-rate
          - business-failure-rate

The best deployment pipeline is not the one with the most YAML.

It is the one that expresses risk controls clearly.

24. Production Verification

Deployment is not complete when Kubernetes accepts a manifest.

Deployment is complete when the service behaves correctly in production.

Post-deploy verification should check:

  • pods become ready
  • no crash loops
  • request traffic succeeds
  • latency remains within threshold
  • error rate remains within threshold
  • dependency failures do not spike
  • consumer contract errors do not spike
  • business error rate does not spike
  • event publishing still works
  • consumer lag remains acceptable
  • dashboards show expected version
  • logs/traces include deployment version

A minimal post-deploy smoke test:

curl -fsS \
  -H "X-Synthetic-Test: true" \
  https://api.example.com/case-service/internal/smoke/readiness

But synthetic tests should not mutate production state unless explicitly designed for safe test tenants/data.

25. Rollback vs Roll Forward

Rollback is comforting, but not always safe.

Rollback is easy when:

  • code change is stateless
  • no database migration occurred
  • no event schema changed incompatibly
  • no external side effect occurred
  • no data written by new version is unreadable by old version

Rollback is risky when:

  • migration is destructive
  • new version writes new state shape
  • consumers already depend on new event fields
  • feature changed external provider state
  • workflow instances started under new version

Therefore mature delivery systems prefer:

  • backward-compatible changes
  • expand-contract database migration
  • feature flags
  • dark launch
  • fast roll-forward patches
  • compatibility windows

Rollback is still useful for crash-looping code.

But for semantic changes, roll-forward is often safer.
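The risk conditions above can be turned into a first-pass recommendation. This is a rough sketch, not a substitute for incident judgment: `RecoverySketch`, `ChangeProfile`, and `recommend` are hypothetical names, and the flags mirror the "rollback is risky when" list.

```java
/** Hypothetical sketch: recommend a recovery direction from what the deployed change already did. */
final class RecoverySketch {

    record ChangeProfile(
        boolean destructiveMigration,
        boolean writesNewStateShape,
        boolean consumersDependOnNewFields,
        boolean externalSideEffects,
        boolean workflowsStartedUnderNewVersion
    ) {}

    enum Recovery { ROLLBACK, ROLL_FORWARD }

    static Recovery recommend(ChangeProfile change) {
        boolean oldVersionCannotSafelyResume =
            change.destructiveMigration()
                || change.writesNewStateShape()
                || change.consumersDependOnNewFields()
                || change.externalSideEffects()
                || change.workflowsStartedUnderNewVersion();
        // If any state already escaped the old version's assumptions, rolling back is the risky move.
        return oldVersionCannotSafelyResume ? Recovery.ROLL_FORWARD : Recovery.ROLLBACK;
    }
}
```

Recording a profile like this in the release metadata means the decision is already made before the incident starts.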

26. Database Change Gates

Even though this series has separate database/design material, microservice CI/CD must treat data changes as deployment risks.

Database changes need gates:

  • migration syntax check
  • migration ordering check
  • backward compatibility review
  • destructive change detection
  • large-table migration risk review
  • lock-time estimation
  • rollback/roll-forward plan
  • application version compatibility matrix

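Example expand-contract sequence:

  1. Expand: add the new column or table alongside the old one, without removing anything.
  2. Deploy application code that writes to both old and new structures and still reads the old one.
  3. Backfill existing rows into the new structure.
  4. Switch reads to the new structure and verify.
  5. Stop writing to the old structure.
  6. Contract: remove the old structure in a later, separately approved deployment.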

A pipeline should block destructive changes unless they are explicitly approved and scheduled.
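Destructive change detection can start as a simple text scan of migration scripts. The sketch below is intentionally naive and only an illustration: `MigrationGateSketch` is a hypothetical name, the keyword list is an assumption, and the scan will also flag matches inside SQL comments and miss dialect-specific forms.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch: flag migrations that need explicit approval before they can ship. */
final class MigrationGateSketch {

    // Assumed keyword list; extend it for the SQL dialect actually in use.
    private static final Pattern DESTRUCTIVE = Pattern.compile(
        "\\b(drop\\s+table|drop\\s+column|truncate)\\b",
        Pattern.CASE_INSENSITIVE);

    static boolean isDestructive(String migrationSql) {
        return DESTRUCTIVE.matcher(migrationSql).find();
    }

    /** Destructive scripts pass only with an explicit, recorded approval. */
    static boolean passesGate(String migrationSql, boolean explicitlyApproved) {
        return !isDestructive(migrationSql) || explicitlyApproved;
    }
}
```

A false positive costs one approval click. A false negative can cost data, so a gate like this should err toward blocking.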

27. Event Contract Gates

Event-driven systems need compatibility gates too.

A producer should not publish events that break consumers.

Check:

  • event name stability
  • event version compatibility
  • required field changes
  • enum changes
  • field removal
  • type change
  • semantic change
  • ordering expectation change
  • partition key change

Example event compatibility rule:

compatibility:
  event: CaseEscalated
  allowed:
    - add_optional_field
    - add_nullable_field
    - add_new_event_type
  forbidden:
    - remove_field
    - rename_field
    - change_field_type
    - change_partition_key
    - change_semantic_meaning_without_new_event

The worst event breaking change is not a field removal.

It is a semantic change that keeps the same schema.

Example:

Before: CaseEscalated means case requires supervisor review.
After:  CaseEscalated means case was merely suggested for review.

Schema compatibility cannot detect semantic incompatibility.

Humans and ADRs still matter.
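The structural part of the rule is mechanical and can be checked in code. What follows is a minimal sketch, not a schema-registry client: `EventCompatibilitySketch` and `Field` are hypothetical names, a rename shows up as a removal, and treating a new required field as breaking is an assumption inferred from the allowed list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: structural compatibility check between two versions of one event schema. */
final class EventCompatibilitySketch {

    record Field(String type, boolean required) {}

    static List<String> breakingChanges(Map<String, Field> oldSchema, Map<String, Field> newSchema) {
        List<String> breaking = new ArrayList<>();
        // Existing fields must survive with the same type.
        for (Map.Entry<String, Field> entry : new TreeMap<>(oldSchema).entrySet()) {
            Field now = newSchema.get(entry.getKey());
            if (now == null) {
                breaking.add("remove_field: " + entry.getKey());
            } else if (!now.type().equals(entry.getValue().type())) {
                breaking.add("change_field_type: " + entry.getKey());
            }
        }
        // New fields are fine only when consumers and old producers may omit them.
        for (Map.Entry<String, Field> entry : new TreeMap<>(newSchema).entrySet()) {
            if (!oldSchema.containsKey(entry.getKey()) && entry.getValue().required()) {
                breaking.add("add_required_field: " + entry.getKey());
            }
        }
        return breaking;
    }
}
```

This check passes the `CaseEscalated` meaning change untouched, because the schema did not change. It covers the mechanical half of the gate only.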

28. Contract Verification Is Not Full Integration Testing

Contract testing is narrow by design.

It asks:

Does provider behavior still satisfy consumer expectations?

It does not ask:

Does the whole business journey work across all services?

That distinction is useful.

A large end-to-end suite is slow and fragile.

A contract suite is targeted and fast.

Use contract tests to protect service autonomy.

Use a few end-to-end synthetic journeys to protect critical user journeys.

Do not invert the ratio.

29. Release Evidence Pack

Every production deployment should produce a release evidence pack.

Example:

releaseEvidence:
  service: case-service
  version: 1.42.0
  imageDigest: sha256:abc123
  commit: 9f7c2e4
  pipelineRun: https://ci.example.com/runs/9321
  deployedAt: 2026-07-05T13:45:00+07:00
  deployedBy: delivery-bot
  approver: case-platform-team
  changes:
    - ADR-0184-risk-score-explanation
    - PR-1842
  gates:
    unitTests: passed
    componentTests: passed
    contractVerification: passed
    vulnerabilityPolicy: passed
    manifestPolicy: passed
    canaryAnalysis: passed
  rollback:
    strategy: roll-forward-preferred
    previousVersion: 1.41.3
  telemetry:
    dashboard: https://observability.example.com/d/case-service
    traceQuery: service.name="case-service" AND deployment.version="1.42.0"

This evidence helps during:

  • incident review
  • audit
  • compliance review
  • debugging
  • release notes
  • rollback decision
  • security investigation

If your deployment cannot produce evidence, it is not enterprise-grade.
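An evidence pack is only useful if something checks it. The sketch below shows one way to verify that every required gate ran and passed. It is an illustration under assumptions: `EvidenceSketch` is a hypothetical name, and gate results are simplified to a map of gate name to status string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: prove that required gates were executed, not just that nothing failed. */
final class EvidenceSketch {

    static List<String> missingOrFailed(Set<String> requiredGates, Map<String, String> gateResults) {
        List<String> problems = new ArrayList<>();
        for (String gate : new TreeSet<>(requiredGates)) {
            String result = gateResults.get(gate);
            if (result == null) {
                // A skipped gate is treated as a failure: absence of evidence is not a pass.
                problems.add(gate + ": not executed");
            } else if (!"passed".equals(result)) {
                problems.add(gate + ": " + result);
            }
        }
        return problems;
    }
}
```

The distinction between "not executed" and "failed" matters in audits. A bypassed scan and a failing scan are different incidents.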

30. Environment Configuration Gates

Configuration is runtime behavior.

The pipeline should validate config before deployment.

Check:

  • required properties present
  • values within expected range
  • timeout budget valid
  • retry count within policy
  • rate limit sane
  • feature flag default safe
  • dependency endpoint allowed
  • secret reference exists
  • tenant config valid
  • region config valid
  • observability config present

Example config validation:

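import jakarta.validation.constraints.*;
import java.net.URI;
import java.time.Duration;
import org.hibernate.validator.constraints.time.*;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.validation.annotation.Validated;

@ConfigurationProperties(prefix = "case.dependency.decision")
@Validated
public record DecisionClientProperties(
    @NotNull URI baseUrl,  // @NotBlank only supports CharSequence types, so use @NotNull for URI
    @NotNull @DurationMin(millis = 50) Duration connectTimeout,
    @NotNull @DurationMax(seconds = 3) Duration responseTimeout,
    @Min(0) @Max(2) int maxRetries
) {}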

Invalid config should fail before production traffic reaches the service.

31. Deployment Gate Design

Not every service needs the same gates.

Use risk-based gates.

High-risk service examples:

  • payment
  • enforcement decision
  • identity
  • audit log
  • regulatory reporting
  • workflow coordinator
  • central customer profile

Low-risk service examples:

  • internal metadata viewer
  • read-only catalog
  • non-critical recommendation service

Gate intensity should reflect risk.

Example:

| Gate | Low risk | Medium risk | High risk |
|---|---|---|---|
| Unit/component tests | yes | yes | yes |
| Contract verification | yes | yes | yes |
| Security scan | yes | yes | yes |
| Manual approval | no | conditional | yes for risky change |
| Canary | optional | yes | yes |
| Synthetic transaction | optional | yes | yes |
| Audit evidence check | no | conditional | yes |
| DR impact review | no | conditional | yes |

Too many gates for every service creates bypass culture.

Too few gates for critical services creates incident culture.
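Risk-based gating is straightforward to encode as policy. This is a minimal sketch of the idea: `GatePolicySketch` and its enums are hypothetical names, and it returns only the always-mandatory gates per tier, leaving "optional" and "conditional" gates, including manual approval, to separate rules.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch: mandatory gates grow with service risk instead of being uniform. */
final class GatePolicySketch {

    enum Risk { LOW, MEDIUM, HIGH }

    enum Gate {
        UNIT_COMPONENT_TESTS, CONTRACT_VERIFICATION, SECURITY_SCAN,
        CANARY, SYNTHETIC_TRANSACTION, AUDIT_EVIDENCE_CHECK, DR_IMPACT_REVIEW
    }

    static Set<Gate> mandatoryGates(Risk risk) {
        // Baseline every service pays, regardless of tier.
        EnumSet<Gate> gates = EnumSet.of(
            Gate.UNIT_COMPONENT_TESTS, Gate.CONTRACT_VERIFICATION, Gate.SECURITY_SCAN);
        if (risk != Risk.LOW) {
            gates.add(Gate.CANARY);
            gates.add(Gate.SYNTHETIC_TRANSACTION);
        }
        if (risk == Risk.HIGH) {
            gates.add(Gate.AUDIT_EVIDENCE_CHECK);
            gates.add(Gate.DR_IMPACT_REVIEW);
        }
        return gates;
    }
}
```

Keeping the mapping in one reviewed place makes gate changes visible. Teams argue about the policy once instead of bypassing it per deploy.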

32. Release Approval Should Be Policy-Based

Manual approval can be useful.

But manual approval without context is weak.

Approvers need evidence:

  • what changed
  • risk category
  • test results
  • contract impact
  • migration impact
  • dependency impact
  • rollback/roll-forward plan
  • canary plan
  • observability plan
  • feature flag plan

Better:

approvalRequired:
  when:
    - destructiveDatabaseChange: true
    - publicApiBreakingChange: true
    - regulatoryDecisionPathChanged: true
    - auditEventSchemaChanged: true
    - serviceCriticality: tier-0

Manual approval should be reserved for judgment calls.

Routine safety should be automated.
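The policy above can be evaluated in code so that approvers are asked only when a trigger fires, and are told which one. The sketch is illustrative: `ApprovalPolicySketch` and `Change` are hypothetical names, and the fields mirror the YAML conditions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: decide whether a human approval is needed, and say why. */
final class ApprovalPolicySketch {

    record Change(
        boolean destructiveDatabaseChange,
        boolean publicApiBreakingChange,
        boolean regulatoryDecisionPathChanged,
        boolean auditEventSchemaChanged,
        String serviceCriticality
    ) {}

    /** Reasons double as context for the approver; an empty list means no approval is needed. */
    static List<String> approvalReasons(Change change) {
        List<String> reasons = new ArrayList<>();
        if (change.destructiveDatabaseChange()) reasons.add("destructiveDatabaseChange");
        if (change.publicApiBreakingChange()) reasons.add("publicApiBreakingChange");
        if (change.regulatoryDecisionPathChanged()) reasons.add("regulatoryDecisionPathChanged");
        if (change.auditEventSchemaChanged()) reasons.add("auditEventSchemaChanged");
        if ("tier-0".equals(change.serviceCriticality())) reasons.add("serviceCriticality=tier-0");
        return reasons;
    }

    static boolean approvalRequired(Change change) {
        return !approvalReasons(change).isEmpty();
    }
}
```

Returning reasons instead of a bare boolean gives the approver the context that a plain "approve?" button lacks.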

33. Progressive Delivery

Progressive delivery reduces blast radius.

Patterns:

  • rolling deployment
  • canary
  • blue-green
  • shadow traffic
  • dark launch
  • feature flag exposure

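A typical canary shifts a small share of traffic to the new version, analyzes it, and widens exposure in steps (for example 5%, 25%, 50%, then 100%), stopping as soon as a signal degrades.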

Canary analysis should include technical and business signals.

Technical:

  • error rate
  • latency percentile
  • saturation
  • dependency timeout
  • restart rate

Business:

  • command failure rate
  • validation rejection rate
  • payment failure rate
  • case escalation completion rate
  • workflow stuck count

A deploy can pass HTTP metrics while breaking business semantics.
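A canary loop that watches both kinds of signal can be sketched as follows. This is a simplified model, not a progressive delivery controller: `CanarySketch`, `Signals`, `Thresholds`, and the `observe` callback are hypothetical, and a real system would query its metrics backend for each step's observation window.

```java
import java.util.List;
import java.util.function.IntFunction;

/** Hypothetical sketch: step traffic up only while technical and business signals stay healthy. */
final class CanarySketch {

    record Signals(double errorRate, double p95LatencyMillis, double businessFailureRate) {}

    record Thresholds(double maxErrorRate, double maxP95LatencyMillis, double maxBusinessFailureRate) {}

    enum Verdict { PROMOTE, ABORT }

    static Verdict analyze(Signals canary, Thresholds limits) {
        boolean healthy =
            canary.errorRate() <= limits.maxErrorRate()
                && canary.p95LatencyMillis() <= limits.maxP95LatencyMillis()
                // The business signal is checked alongside HTTP-level metrics, not after them.
                && canary.businessFailureRate() <= limits.maxBusinessFailureRate();
        return healthy ? Verdict.PROMOTE : Verdict.ABORT;
    }

    /**
     * @param weights traffic percentages to step through, for example 5, 25, 50, 100
     * @param observe returns the signals measured while the canary held the given weight
     * @return the last weight that passed analysis; 0 if the first step was aborted
     */
    static int run(List<Integer> weights, IntFunction<Signals> observe, Thresholds limits) {
        int lastHealthyWeight = 0;
        for (int weight : weights) {
            if (analyze(observe.apply(weight), limits) == Verdict.ABORT) {
                return lastHealthyWeight;
            }
            lastHealthyWeight = weight;
        }
        return lastHealthyWeight;
    }
}
```

In this model a rollout whose error rate and latency look fine still stops at the last healthy weight if the business failure rate crosses its limit.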

34. Deployment Does Not Replace Feature Flags

Deployment changes what code exists in runtime.

Feature flags change which behavior is active.

Use both.

Example:

public RiskExplanation explainRisk(CaseId caseId) {
    if (flags.isEnabled("risk.explanation.v2", TenantContext.current())) {
        return riskExplanationV2.explain(caseId);
    }
    return riskExplanationV1.explain(caseId);
}

Pipeline requirements for feature flags:

  • default value explicit
  • owner explicit
  • expiry explicit
  • cleanup issue created
  • flag type explicit
  • production change audit logged
  • unsafe combinations tested

Feature flags without lifecycle management become permanent complexity.

35. Trunk-Based Development and Microservices

Independent deployability works best when code integrates frequently.

Long-lived branches delay integration risk.

Microservices do not eliminate merge risk.

They move integration risk to contracts and runtime collaboration.

Practical rules:

  • keep main branch releasable
  • use short-lived branches
  • hide incomplete behavior behind flags
  • use branch-by-abstraction for large changes
  • keep migrations backward compatible
  • run contract tests continuously
  • deploy small changes frequently

Small, frequent, reversible changes are easier to reason about than large release bundles.

36. Avoiding Pipeline Monoliths

A central pipeline template is useful.

A pipeline monolith is dangerous.

Symptoms:

  • every service must use identical stages regardless of risk
  • small pipeline changes break dozens of teams
  • teams copy-paste YAML and drift silently
  • platform team becomes bottleneck for delivery changes
  • service-specific checks are hard to add

Better:

  • golden pipeline template
  • versioned reusable workflows
  • service-level extension points
  • policy-as-code for mandatory gates
  • local service ownership for domain-specific checks

Example:

uses: platform/java-service-pipeline@v4
with:
  java-version: 21
  service-tier: tier-1
  contract-tests: true
  deployment-strategy: canary
  requires-audit-evidence: true
custom-stages:
  - name: regulatory-workflow-simulation
    command: ./mvnw -Pworkflow-simulation verify

The platform should provide a paved road, not a prison.

37. CI/CD Failure Modes

| Failure mode | Symptom | Architectural cause | Countermeasure |
| --- | --- | --- | --- |
| Slow pipeline | Teams bypass checks | Tests not layered | Split fast/slow gates |
| Flaky tests | Low trust in pipeline | Environment nondeterminism | Hermetic tests, quarantine policy |
| Rebuild per env | Staging differs from prod | Artifact immutability absent | Build once, promote digest |
| Lockstep release | Many teams deploy together | Breaking contracts | Compatibility windows |
| Rollback fails | Old code cannot read new state | Destructive migration | Expand-contract |
| Silent contract break | Consumer fails after deploy | Provider-only testing | CDC/provider verification |
| Zombie feature flags | Code complexity grows | No lifecycle | Owner/expiry/cleanup gate |
| Manual approval theater | Approvers rubber-stamp | No evidence | Risk-based evidence pack |
| Pipeline sprawl | Every repo differs | No platform golden path | Reusable pipeline templates |
| Unsafe production deploy | Metrics ignored | No runtime verification | Progressive delivery analysis |
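As one example of turning a countermeasure into an executable gate, the owner/expiry check for feature flags could look roughly like the sketch below. The `FeatureFlag` record and `FlagExpiryGate` class are hypothetical; a real gate would read flags from whatever flag registry the team uses.

```java
import java.time.LocalDate;
import java.util.List;

// Hypothetical gate: fails the pipeline when a flag has no owner or is past its expiry date.
final class FlagExpiryGate {

    record FeatureFlag(String name, String owner, LocalDate expiresOn) {}

    // Returns the names of flags that violate the lifecycle policy as of 'today'.
    static List<String> violations(List<FeatureFlag> flags, LocalDate today) {
        return flags.stream()
                .filter(f -> f.owner() == null || f.owner().isBlank()
                        || f.expiresOn() == null || f.expiresOn().isBefore(today))
                .map(FeatureFlag::name)
                .toList();
    }

    public static void main(String[] args) {
        List<FeatureFlag> flags = List.of(
                new FeatureFlag("new-escalation-ui", "case-team", LocalDate.of(2030, 1, 1)),
                new FeatureFlag("legacy-export", "", LocalDate.of(2030, 1, 1)));
        System.out.println("Flag violations: " + violations(flags, LocalDate.of(2025, 6, 1)));
    }
}
```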

38. Java Service Pipeline Checklist

Before a Java microservice is considered independently deployable, it should satisfy:

  • build uses pinned Java/build-tool version
  • artifact built once and promoted by digest
  • unit tests cover domain invariants
  • component tests cover service boundary
  • contract tests protect consumers/providers
  • architecture tests protect dependency direction
  • image scan and dependency scan run automatically
  • SBOM/provenance/evidence produced
  • deployment manifest validates probes/resources/security context
  • database migration strategy is backward compatible
  • event/API compatibility checked
  • config validated before startup
  • post-deploy smoke test exists
  • canary or progressive strategy exists for non-trivial risk
  • rollback/roll-forward plan exists
  • deployment version visible in logs/metrics/traces
  • release evidence retained
  • owner and runbook linked in service catalog

39. Case Study: Regulatory Case Service Pipeline

Imagine case-service owns case lifecycle commands.

A change adds a new command:

POST /cases/{caseId}/escalation-recommendations

Risk factors:

  • affects investigator workflow
  • writes new audit event
  • publishes event consumed by notification service
  • adds read model field used by dashboard
  • requires database migration
  • may trigger SLA timer

Pipeline gates:

service: case-service
change: escalation-recommendation-command
risk: high
requiredGates:
  - domain-unit-tests
  - component-command-tests
  - audit-event-contract-check
  - notification-consumer-contract-check
  - dashboard-read-model-compatibility-check
  - migration-expand-contract-check
  - workflow-sla-simulation
  - canary-analysis
  - release-evidence-pack

A shallow pipeline would only compile and deploy.

A production-grade pipeline understands the business impact of the change.
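One way to make that understanding explicit is to derive the gate list from declared risk factors instead of hard-coding it per change. The sketch below is an assumption about how such a mapping might look; `GateSelector` and the `RiskFactor` names are hypothetical, and the rule that any risk factor triggers canary analysis and an evidence pack is an illustrative choice.

```java
import java.util.EnumSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical mapping from declared risk factors to required pipeline gates.
final class GateSelector {

    enum RiskFactor {
        WRITES_AUDIT_EVENT, PUBLISHES_EVENT, CHANGES_READ_MODEL, REQUIRES_MIGRATION, AFFECTS_SLA_TIMER
    }

    static Set<String> requiredGates(Set<RiskFactor> factors) {
        Set<String> gates = new LinkedHashSet<>(); // keeps pipeline order stable
        // Baseline gates run for every change.
        gates.add("domain-unit-tests");
        gates.add("component-command-tests");
        // Risk-specific gates are added only when the change declares the risk.
        if (factors.contains(RiskFactor.WRITES_AUDIT_EVENT)) gates.add("audit-event-contract-check");
        if (factors.contains(RiskFactor.PUBLISHES_EVENT)) gates.add("notification-consumer-contract-check");
        if (factors.contains(RiskFactor.CHANGES_READ_MODEL)) gates.add("dashboard-read-model-compatibility-check");
        if (factors.contains(RiskFactor.REQUIRES_MIGRATION)) gates.add("migration-expand-contract-check");
        if (factors.contains(RiskFactor.AFFECTS_SLA_TIMER)) gates.add("workflow-sla-simulation");
        // Assumption: any declared risk factor makes the change non-trivial.
        if (!factors.isEmpty()) {
            gates.add("canary-analysis");
            gates.add("release-evidence-pack");
        }
        return gates;
    }

    public static void main(String[] args) {
        requiredGates(EnumSet.allOf(RiskFactor.class)).forEach(System.out::println);
    }
}
```

With every risk factor declared, the sketch yields the same nine gates listed for the escalation-recommendation change.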

40. CI/CD Design ADR Template

Use an ADR when introducing or changing delivery strategy.

# ADR: Delivery Strategy for Case Service

## Context
Case Service is a tier-1 service. It owns case lifecycle commands and audit-sensitive state changes.

## Decision
Use build-once-promote-by-digest pipeline with component tests, contract verification,
manifest policy checks, migration compatibility checks, and canary deployment for production.

## Alternatives Considered
1. Direct deploy on merge
2. Manual release bundle
3. Shared release train

## Consequences
Positive:
- independent deployment possible
- provider/consumer compatibility checked
- release evidence available for audit

Negative:
- longer pipeline than low-risk services
- requires maintaining contract tests
- requires deployment metadata discipline

## Fitness Functions
- all public API changes run compatibility diff
- all event changes run schema compatibility check
- production deploy must expose deployment.version metric
- canary aborts on SLO burn-rate threshold

## Rollback / Roll-forward
Roll-forward preferred for semantic or migration changes. Rollback allowed for crash-loop or non-data-writing changes.
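The rollback/roll-forward rule in the ADR can be encoded so that the recovery choice is made before an incident rather than during it. The following is a minimal sketch under assumptions: `RecoveryPolicy`, its `Change` record, and the three boolean inputs are hypothetical simplifications of what a real change descriptor would carry.

```java
// Hypothetical encoding of the ADR's rollback / roll-forward rule.
final class RecoveryPolicy {

    enum Action { ROLLBACK, ROLL_FORWARD }

    // Simplified description of a deployed change.
    record Change(boolean includesMigration, boolean changesSemantics, boolean hasWrittenNewData) {}

    // Rollback is recommended only when nothing data-affecting has happened,
    // for example a crash-looping version that never served a write.
    static Action recommend(Change change) {
        if (change.includesMigration() || change.changesSemantics() || change.hasWrittenNewData()) {
            return Action.ROLL_FORWARD;
        }
        return Action.ROLLBACK;
    }

    public static void main(String[] args) {
        System.out.println(recommend(new Change(false, false, false)));
        System.out.println(recommend(new Change(true, false, true)));
    }
}
```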

41. A Practical Implementation Sequence

Do not try to build a perfect platform overnight.

Sequence it.

Step 1 — Make builds reproducible

  • use wrappers
  • pin runtime versions
  • build in clean CI
  • publish immutable image digest

Step 2 — Add fast correctness gates

  • compile
  • unit tests
  • component tests
  • static checks

Step 3 — Add contract gates

  • HTTP/gRPC contract
  • event compatibility
  • consumer-driven contracts for critical consumers

Step 4 — Add deployment verification

  • readiness/liveness/startup
  • smoke tests
  • deployment version telemetry

Step 5 — Add progressive delivery

  • canary or blue-green
  • automated analysis
  • rollback/roll-forward policy
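Automated analysis for a canary often reduces to a comparison like the one sketched below: how fast is the canary consuming the error budget relative to the SLO? The class name `CanaryAnalysis` and the threshold value are hypothetical, and a production analysis would also consider traffic volume, latency, and observation windows.

```java
// Hypothetical canary check based on SLO burn rate.
final class CanaryAnalysis {

    // Burn rate = observed error ratio divided by the error budget (1 - SLO target).
    // A value of 1.0 means the budget is being consumed exactly at the sustainable pace.
    static double burnRate(long failedRequests, long totalRequests, double sloTarget) {
        if (sloTarget <= 0.0 || sloTarget >= 1.0) {
            throw new IllegalArgumentException("sloTarget must be between 0 and 1 (exclusive)");
        }
        if (totalRequests <= 0) {
            return 0.0; // no traffic observed; a real analysis might treat this as inconclusive
        }
        double errorRatio = (double) failedRequests / totalRequests;
        double errorBudget = 1.0 - sloTarget;
        return errorRatio / errorBudget;
    }

    // Abort the canary when the budget burns faster than the allowed multiple.
    static boolean shouldAbort(long failed, long total, double sloTarget, double maxBurnRate) {
        return burnRate(failed, total, sloTarget) > maxBurnRate;
    }

    public static void main(String[] args) {
        // 30 failures in 10,000 requests against a 99.9% SLO burns budget about 3x too fast.
        System.out.println("Abort canary: " + shouldAbort(30, 10_000, 0.999, 2.0));
    }
}
```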

Step 6 — Add governance evidence

  • SBOM
  • provenance
  • release evidence
  • service catalog linkage

Step 7 — Optimize developer experience

  • reusable pipeline
  • service template
  • self-service environment
  • local pipeline simulation

The order matters.

A sophisticated canary is less useful if the artifact is not immutable, because the canary then verifies something that may differ from what is later promoted.
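The immutability requirement can be checked mechanically at each promotion step. The sketch below hashes raw artifact bytes with SHA-256 and refuses promotion when the digest differs from the one recorded at build time. `ArtifactPromotion` is a hypothetical name, and hashing a byte array is a simplification: container registries compute image digests over the image manifest, so a real pipeline would compare the registry-reported digest instead.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Hypothetical promotion guard: the artifact promoted must be byte-identical to the one built.
final class ArtifactPromotion {

    // Computes a "sha256:<hex>" digest string for the given content.
    static String sha256(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return "sha256:" + HexFormat.of().formatHex(md.digest(content));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    // Throws if the candidate does not match the digest recorded by the build stage.
    static void verifyPromotion(String recordedDigest, byte[] candidate) {
        String actual = sha256(candidate);
        if (!actual.equals(recordedDigest)) {
            throw new IllegalStateException(
                    "Refusing promotion: expected " + recordedDigest + " but found " + actual);
        }
    }

    public static void main(String[] args) {
        byte[] built = "case-service build output".getBytes(StandardCharsets.UTF_8);
        String recorded = sha256(built);   // stored as release evidence at build time
        verifyPromotion(recorded, built);  // same bytes: promotion proceeds
        System.out.println("Promoted " + recorded);
    }
}
```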

42. What Top Engineers Notice

Average engineers ask:

Did the pipeline pass?

Strong engineers ask:

What risk did the pipeline actually reduce?

Average engineers ask:

Can we deploy this service separately?

Strong engineers ask:

Can we deploy this service separately without breaking consumers, corrupting data, hiding failure, or losing auditability?

Average engineers ask:

Can we roll back?

Strong engineers ask:

Is rollback semantically safe after this migration/event/workflow change, or do we need roll-forward?

CI/CD for microservices is not about speed alone.

It is about safe speed.

43. Key Takeaways

  • Independent deployability is an architecture property enforced through CI/CD.
  • Build once and promote the same immutable artifact.
  • Contract verification is central to service autonomy.
  • Deployment and release should be separated.
  • Progressive delivery reduces blast radius.
  • Rollback is not always safe; roll-forward often matters more.
  • Pipeline gates should be risk-based, not one-size-fits-all.
  • A pipeline should produce evidence, not just artifacts.
  • Governance should be executable where possible.
  • The best CI/CD system helps teams move faster because it makes unsafe changes visible early.
