
Multi-Tenancy and Enterprise Segmentation

Learn Enterprise CPQ OMS Camunda 7 - Part 051

Multi-tenancy and enterprise segmentation design for a production-grade Java microservices CPQ and OMS platform, covering tenant context, catalog segmentation, pricing segmentation, authorization, PostgreSQL, Kafka, Redis, Camunda 7, audit, testing, and operational boundaries.


Part 051 — Multi-Tenancy and Enterprise Segmentation

Multi-tenancy is not a column.

It is a promise.

The promise is simple:

A user, process, event, cache entry, task, report, document, search result, and operational action from one tenant must not accidentally observe, mutate, infer, or depend on another tenant's data.

In CPQ/OMS, this is harder than ordinary SaaS CRUD.

A quote can depend on:

  • product catalog visibility
  • customer/account ownership
  • channel
  • region
  • sales organization
  • price book
  • discount entitlement
  • approval authority
  • workflow task assignment
  • inventory boundary
  • billing-account boundary
  • document visibility
  • audit/legal retention rules

That means tenant_id is necessary, but not sufficient.

A production CPQ/OMS needs two related models:

  1. Tenant isolation — who owns the data and where isolation must be enforced.
  2. Enterprise segmentation — which commercial, organizational, legal, and operational segments affect behavior inside a tenant.

If you collapse both into a single tenant_id, the platform becomes simple but wrong.

If you model every dimension as a tenant, the platform becomes fragmented and impossible to operate.

The skill is knowing the difference.


1. Core Mental Model

A tenant is the top-level isolation boundary.

A segment is a behavior-selection boundary.

They answer different questions.

| Concept | Question | Example |
| --- | --- | --- |
| Tenant | Whose data is this? | acme-id, globex-id |
| Legal entity | Which legal party owns this transaction? | ACME_US, ACME_SG |
| Sales organization | Which sales structure governs selling? | ENT_APAC, SMB_EU |
| Channel | Through which channel is this sold? | direct, partner, self_service |
| Region | Which regional policy applies? | ID, SG, EU |
| Brand | Which brand/commercial catalog applies? | premium, standard |
| Customer segment | Which customer class affects pricing/policy? | enterprise, public_sector |
| Product segment | Which product family or line of business applies? | connectivity, cloud, managed_service |
| Operational segment | Which operations team handles work? | tier_1_ops, enterprise_fulfillment |

The invariant:

Tenant isolation prevents cross-customer leakage. Segmentation selects allowed behavior inside the tenant boundary.

Example:

Tenant: ACME
  Legal Entity: ACME Indonesia
  Sales Org: Enterprise APAC
  Channel: Direct Sales
  Customer Segment: Strategic Enterprise
  Catalog Segment: Connectivity Premium
  Price Segment: APAC Enterprise 2026
  Approval Segment: Regional VP Approval Matrix

This is one tenant context with many segment dimensions.


2. The Dangerous Simplification

Bad design:

tenant_id = region

or:

tenant_id = customer_id

or:

tenant_id = sales_org

This feels practical at first.

Then the failures arrive.

2.1 If Region Becomes Tenant

A global enterprise customer with activity in multiple regions becomes multiple tenants.

You lose:

  • global customer visibility
  • global contract consistency
  • cross-region renewal handling
  • consolidated approval trace
  • global account reporting

2.2 If Customer Becomes Tenant

A holding company with many subsidiaries becomes hard to model.

You lose:

  • parent-child account hierarchy
  • shared contract terms
  • account-level entitlement
  • shared quote templates
  • central procurement workflow

2.3 If Sales Org Becomes Tenant

A quote moving between sales organizations becomes cross-tenant migration.

You lose:

  • clean account ownership transfer
  • unified audit
  • consistent document access
  • operational visibility

Top-level tenant must be stable.

Segments can change.

That single difference matters.


3. Tenant Boundary Decision

For this CPQ/OMS series, use this default:

tenant_id is the highest isolation boundary for customer-owned or organization-owned business data. Segments are separate attributes resolved through policy and master data.

A tenant may represent:

  • one enterprise customer in a B2B SaaS model
  • one internal business unit in an internal enterprise platform
  • one market/operator in a telecom-style deployment
  • one reseller/partner in a partner-operated CPQ model

Do not decide by data model convenience.

Decide by these questions:

| Decision Question | If yes, candidate for tenant boundary |
| --- | --- |
| Must this party never see another party's quote/order/audit data? | Yes |
| Can this party have separate encryption/retention/legal controls? | Yes |
| Can this party be onboarded/offboarded independently? | Yes |
| Can this party have separate catalog/pricing/admin configuration? | Sometimes |
| Can this party share customers/contracts with others? | Usually no |
| Can operations support this boundary during incidents? | Must be yes |

A tenant boundary that operations cannot support is not a real boundary.


4. Isolation Models

There are four practical isolation models.

4.1 Shared Database, Shared Schema

All tenants share tables.

Every tenant-owned row has tenant_id.

Good for:

  • many tenants
  • cost efficiency
  • centralized operations
  • shared reporting infrastructure
  • simple deployment

Hard parts:

  • cross-tenant leakage risk
  • every query must enforce tenant scope
  • index design must include tenant dimensions
  • noisy-neighbor risk
  • tenant-level restore is harder

This is the default for most SaaS-like CPQ/OMS systems.

But it demands discipline.

4.2 Shared Database, Separate Schema

Each tenant has its own schema.

Good for:

  • stronger data separation
  • easier tenant export/restore
  • easier per-tenant migration in some cases

Hard parts:

  • many schemas create operational complexity
  • migrations become slower
  • cross-tenant reporting becomes harder
  • connection/pool/schema routing gets complicated

4.3 Separate Database Per Tenant

Each tenant has a dedicated database.

Good for:

  • strong isolation
  • regulatory/compliance separation
  • easier per-tenant backup/restore
  • noisy-neighbor mitigation

Hard parts:

  • expensive operations
  • tenant provisioning complexity
  • migrations multiply
  • central analytics/read models need aggregation

4.4 Separate Runtime Per Tenant

Each tenant has separate services or Camunda engines.

Good for:

  • high-value tenants
  • strict isolation
  • custom release windows
  • contractual separation

Hard parts:

  • operational overhead
  • version drift
  • duplicated monitoring
  • harder fleet governance

Camunda 7 explicitly supports two multi-tenancy approaches: one process engine per tenant, or one engine using tenant identifiers. In the tenant-identifier approach, tenant data is stored in the same schema and isolation is provided by a tenant identifier column; Camunda also warns that transparent tenant separation is not implemented for all APIs, so not every Camunda API should be directly exposed to tenants.

Reference: https://docs.camunda.org/manual/7.24/user-guide/process-engine/multi-tenancy/


5. Reference Implementation Baseline

For the reference implementation:

Business services: shared DB + shared schema + tenant_id
Camunda 7: single engine + tenant identifiers, hidden behind Workflow Service
Kafka: shared topics partitioned by aggregate, tenant_id in envelope
Redis: shared cluster, tenant-prefixed keys
Search/read model: shared index/table with tenant filter as mandatory predicate
Documents: shared object store bucket, tenant-scoped object keys and metadata
Admin control plane: tenant-aware configuration with approval/audit

This gives a realistic enterprise baseline.

Then allow exceptions:

| Exception | Escalated Isolation |
| --- | --- |
| Regulated tenant requires data residency | separate DB/schema/runtime region |
| Very large tenant causes noisy-neighbor risk | dedicated pools/topics/cache/runtime |
| Tenant requires custom workflow | tenant-specific process definition version |
| Tenant requires custom catalog/pricing | tenant-specific segment config, not separate codebase |
| Tenant requires contractual audit export | tenant-scoped archive/export pipeline |

Default shared.

Escalate with evidence.


6. Tenant Context

Every request must produce a tenant context.

Not just a string.

A useful tenant context contains:

public record TenantContext(
    String tenantId,
    String principalId,
    Set<String> roles,
    Set<String> groups,
    Set<String> tenantMemberships,
    String legalEntityId,
    String salesOrgId,
    String channelId,
    String regionCode,
    String customerSegment,
    String correlationId
) {}

But do not blindly accept all values from the client.

Client can propose.

Server must resolve.

6.1 Tenant Resolution Sources

| Source | Trust Level | Usage |
| --- | --- | --- |
| JWT tenant claim | Medium/high if issued by trusted IdP | membership hint |
| URL path /tenants/{tenantId} | Low by itself | requested scope |
| Header X-Tenant-Id | Low by itself | internal routing only after auth |
| Account ownership table | High | object-level verification |
| Partner delegation table | High | delegated access |
| Admin session selection | Medium | explicit tenant switching |
| Service credential | High for service identity, low for object authorization | backend task routing |

Correct rule:

A tenant id in a request selects intended scope; it does not prove authorization.

The server must verify:

principal can access tenant
principal can access object
object belongs to tenant
operation is allowed in current state
operation is allowed by segment policy

All five are needed.
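
The five checks can be sketched as one decision function. This is a minimal illustration, not the series' actual implementation: the class, record, and enum names are hypothetical, in-memory maps stand in for the membership and ownership tables, and the state and segment rules (submit requires an APPROVED quote, the caller's channel must match the quote's channel) are placeholder policies chosen only to make the sketch concrete.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch: runs the five checks and reports the first one that fails. */
final class TenantAccessCheck {

    enum Denial { NOT_A_MEMBER, OBJECT_NOT_IN_TENANT, NO_OBJECT_ACCESS, STATE_FORBIDS, SEGMENT_FORBIDS }

    record Quote(String tenantId, String id, String ownerId, String state, String channelId) {}

    record Request(String principalId, String requestedTenantId, String quoteId,
                   String operation, String channelId) {}

    private final Map<String, Set<String>> tenantsByPrincipal; // stand-in for a membership table
    private final Map<String, Quote> quotesById;               // stand-in for an ownership table

    TenantAccessCheck(Map<String, Set<String>> tenantsByPrincipal, Map<String, Quote> quotesById) {
        this.tenantsByPrincipal = tenantsByPrincipal;
        this.quotesById = quotesById;
    }

    /** Returns empty when every check passes, otherwise the first denial reason. */
    Optional<Denial> check(Request r) {
        // 1. Principal can access tenant: the requested tenant id is only a proposal.
        Set<String> memberships = tenantsByPrincipal.getOrDefault(r.principalId(), Set.of());
        if (!memberships.contains(r.requestedTenantId())) {
            return Optional.of(Denial.NOT_A_MEMBER);
        }
        // 2. Object belongs to tenant: compare the stored tenant, never the requested one alone.
        Quote quote = quotesById.get(r.quoteId());
        if (quote == null || !quote.tenantId().equals(r.requestedTenantId())) {
            return Optional.of(Denial.OBJECT_NOT_IN_TENANT);
        }
        // 3. Principal can access object (placeholder rule: owner only).
        if (!quote.ownerId().equals(r.principalId())) {
            return Optional.of(Denial.NO_OBJECT_ACCESS);
        }
        // 4. Operation is allowed in current state (placeholder rule).
        if ("submit".equals(r.operation()) && !"APPROVED".equals(quote.state())) {
            return Optional.of(Denial.STATE_FORBIDS);
        }
        // 5. Operation is allowed by segment policy (placeholder rule: channel must match).
        if (!quote.channelId().equals(r.channelId())) {
            return Optional.of(Denial.SEGMENT_FORBIDS);
        }
        return Optional.empty();
    }
}
```

A missing object and an object in another tenant produce the same denial here, so the caller cannot distinguish "does not exist" from "exists elsewhere".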


7. Tenant Context Propagation

Tenant context must cross all boundaries.

Required propagation fields:

| Boundary | Required Tenant Fields |
| --- | --- |
| HTTP | tenant id, principal id, correlation id, auth claims |
| Domain command | tenant id, actor, authority snapshot |
| Database row | tenant id on tenant-owned tables |
| Kafka event | tenant id in envelope and payload metadata |
| Redis key | tenant prefix + environment + service namespace |
| Camunda process | tenant id, business key, process variable contract |
| Document artifact | tenant id, quote/order id, access classification |
| Audit record | tenant id, actor, segment evidence |
| Search projection | tenant id + authorization materialized fields |

If any boundary drops tenant context, that boundary becomes a leak point.


8. PostgreSQL Tenant Design

For shared schema, every tenant-owned table must have tenant_id.

But the deeper rule is:

Foreign keys should preserve tenant containment.

Bad:

CREATE TABLE quote_line (
  id uuid PRIMARY KEY,
  quote_revision_id uuid NOT NULL REFERENCES quote_revision(id),
  tenant_id text NOT NULL
);

This allows an accidental cross-tenant relation: nothing stops quote_revision_id from pointing to a revision owned by another tenant, because the foreign key ignores tenant_id.

Better:

CREATE TABLE quote_revision (
  tenant_id text NOT NULL,
  id uuid NOT NULL,
  quote_id uuid NOT NULL,
  revision_no int NOT NULL,
  state text NOT NULL,
  PRIMARY KEY (tenant_id, id),
  UNIQUE (tenant_id, quote_id, revision_no)
);

CREATE TABLE quote_line (
  tenant_id text NOT NULL,
  id uuid NOT NULL,
  quote_revision_id uuid NOT NULL,
  line_no int NOT NULL,
  PRIMARY KEY (tenant_id, id),
  FOREIGN KEY (tenant_id, quote_revision_id)
    REFERENCES quote_revision(tenant_id, id)
);

Now the database itself prevents cross-tenant child linkage.

8.1 Index Rule

For tenant-owned operational queries:

CREATE INDEX idx_quote_revision_tenant_state_updated
ON quote_revision (tenant_id, state, updated_at DESC);

CREATE INDEX idx_order_tenant_state_created
ON customer_order (tenant_id, state, created_at DESC);

Do not index only state for high-volume multi-tenant tables.

Tenant is usually the leading filter.

8.2 Tenant-Aware Unique Constraints

Wrong:

UNIQUE (quote_number)

Right:

UNIQUE (tenant_id, quote_number)

Unless quote numbers are globally unique by regulatory requirement.

8.3 Row-Level Security

PostgreSQL Row-Level Security can be useful.

But do not use it as the only boundary.

Use it as defense-in-depth when:

  • all application sessions reliably set tenant context
  • migrations and batch jobs are controlled
  • reporting roles are understood
  • operations teams know how RLS affects debugging

Application-level authorization still remains mandatory.

8.4 Tenant-Scoped Audit

Every audit record should include:

tenant_id
object_type
object_id
actor_type
actor_id
operation
before_summary
after_summary
reason_code
correlation_id
occurred_at

Audit without tenant id is operationally incomplete.


9. JPA Repository Tenant Enforcement

JPA can hide SQL.

That is convenient.

It is also dangerous.

The repository layer must enforce tenant explicitly.

public Optional<QuoteRevision> findById(TenantContext tenant, UUID quoteRevisionId) {
    return em.createQuery("""
        select q
        from QuoteRevision q
        where q.tenantId = :tenantId
          and q.id = :id
        """, QuoteRevision.class)
      .setParameter("tenantId", tenant.tenantId())
      .setParameter("id", quoteRevisionId)
      .getResultStream()
      .findFirst();
}

Do not expose generic repository methods like:

findById(UUID id)

for tenant-owned aggregates.

The method signature should make tenant omission impossible.

9.1 Aggregate Root Rule

Tenant id is part of every aggregate root identity.

public record QuoteRevisionId(String tenantId, UUID id) {}
public record OrderId(String tenantId, UUID id) {}
public record ProductOfferingId(String tenantId, String code, int version) {}

Even if your database uses UUID primary keys, your domain identity should still carry tenant.

This prevents accidental cross-tenant calls in code.
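
As a rough illustration of why the composite identity helps, the sketch below keeps revisions in an in-memory map keyed by the QuoteRevisionId record from above. The store class, the QuoteRevision record, and the validation in the compact constructor are hypothetical additions; a real repository would sit on JPA and PostgreSQL.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory store: every lookup needs the tenant because it is part of the key. */
final class TenantScopedQuoteStore {

    record QuoteRevisionId(String tenantId, UUID id) {
        QuoteRevisionId {
            // An identity without a tenant cannot be constructed at all.
            if (tenantId == null || tenantId.isBlank()) {
                throw new IllegalArgumentException("tenantId is required");
            }
            Objects.requireNonNull(id, "id");
        }
    }

    record QuoteRevision(QuoteRevisionId id, String state) {}

    private final Map<QuoteRevisionId, QuoteRevision> rows = new ConcurrentHashMap<>();

    void save(QuoteRevision revision) {
        rows.put(revision.id(), revision);
    }

    /** There is deliberately no findById(UUID) overload. */
    Optional<QuoteRevision> findById(QuoteRevisionId id) {
        return Optional.ofNullable(rows.get(id));
    }
}
```

Because record equality covers both components, the same UUID stored under two tenants resolves to two distinct entries, and a lookup with the wrong tenant finds nothing.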

9.2 No Cross-Tenant Entity Graphs

Do not let one entity association cross tenant boundaries.

A quote line should not reference a product offering entity from a different tenant through ORM association.

Use snapshots:

@Embeddable
public class ProductOfferingSnapshot {
    private String tenantId;
    private String offeringCode;
    private int catalogVersion;
    private String displayName;
    private String segmentCode;
}

Snapshot beats live cross-tenant lookup.


10. Catalog Segmentation

Catalog segmentation decides what can be sold.

It should answer:

Given tenant, legal entity, channel, region, customer segment, and date,
which product offerings are visible and eligible?

Do not encode this as UI filtering only.

Eligibility must be enforced server-side.

10.1 Catalog Visibility Model

Minimum fields:

CREATE TABLE catalog_segment_rule (
  tenant_id text NOT NULL,
  id uuid NOT NULL,
  catalog_publication_id uuid NOT NULL,
  product_offering_code text NOT NULL,
  sales_org_id text,
  channel_id text,
  region_code text,
  customer_segment text,
  effective_from timestamptz NOT NULL,
  effective_to timestamptz,
  status text NOT NULL,
  PRIMARY KEY (tenant_id, id)
);
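
One possible way to evaluate rows of this table server-side is sketched below. Several things here are assumptions rather than statements from the schema: a NULL segment column is treated as a wildcard, effective_to is treated as exclusive, "ACTIVE" is assumed to be the publishable status value, and all Java names are hypothetical.

```java
import java.time.Instant;
import java.util.List;

/** Hypothetical matcher for catalog_segment_rule rows loaded into memory. */
final class CatalogSegmentMatcher {

    record SellingContext(String tenantId, String salesOrgId, String channelId,
                          String regionCode, String customerSegment) {}

    record SegmentRule(String tenantId, String productOfferingCode, String salesOrgId,
                       String channelId, String regionCode, String customerSegment,
                       Instant effectiveFrom, Instant effectiveTo, String status) {

        boolean matches(SellingContext ctx, Instant at) {
            return tenantId.equals(ctx.tenantId())                      // tenant is never a wildcard
                && "ACTIVE".equals(status)                              // assumed status value
                && !at.isBefore(effectiveFrom)
                && (effectiveTo == null || at.isBefore(effectiveTo))    // assumed exclusive upper bound
                && wildcardOrEqual(salesOrgId, ctx.salesOrgId())
                && wildcardOrEqual(channelId, ctx.channelId())
                && wildcardOrEqual(regionCode, ctx.regionCode())
                && wildcardOrEqual(customerSegment, ctx.customerSegment());
        }

        // Assumption: a null rule value means "applies to every value of this dimension".
        private static boolean wildcardOrEqual(String ruleValue, String actual) {
            return ruleValue == null || ruleValue.equals(actual);
        }
    }

    /** Offering codes visible in the given context at the given instant, sorted for stable output. */
    static List<String> visibleOfferings(List<SegmentRule> rules, SellingContext ctx, Instant at) {
        return rules.stream()
            .filter(rule -> rule.matches(ctx, at))
            .map(SegmentRule::productOfferingCode)
            .distinct()
            .sorted()
            .toList();
    }
}
```

The tenant comparison is the only dimension that never accepts a wildcard, which keeps segmentation strictly inside the tenant boundary.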

10.2 Visibility vs Eligibility

Separate them.

Visibility:

Can the user see this offering in the catalog?

Eligibility:

Can this offering be selected for this customer/context?

Example:

  • A sales rep may see an enterprise connectivity package.
  • The customer may be ineligible because service area is unsupported.

Do not hide all ineligible products by default.

In enterprise CPQ, explainability matters.

Sometimes the system should show:

Not eligible: customer location outside supported coverage area.

That is better than a disappearing option.


11. Pricing Segmentation

Pricing segmentation decides what commercial rules apply.

It should answer:

Given tenant, price segment, customer segment, channel, region, product, date,
which price book and discount policy version applies?

Pricing is more sensitive than catalog visibility.

A cross-segment price leak can expose confidential commercial strategy.

11.1 Price Segment Model

price_segment = tenant + legal entity + sales org + channel + region + customer class + product family + effective date

Not every dimension must be populated.

But every decision must record which dimensions were used.
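
A small sketch of recording the dimensions used: each rule lists only the dimensions it populates, and the decision returns exactly those. The "most populated dimensions wins" selection is an assumption made for illustration, not a rule stated in this series, and the names are hypothetical. Ties between equally specific rules are not resolved meaningfully here; a real resolver would need an explicit priority.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical resolver: picks a price book and reports which dimensions drove the choice. */
final class PriceSegmentResolver {

    /** dimensions holds only the populated dimensions, e.g. {"region": "ID", "channel": "DIRECT"}. */
    record PriceRule(String tenantId, String priceBookVersion, Map<String, String> dimensions) {}

    record Decision(String tenantId, String priceBookVersion, Map<String, String> dimensionsUsed) {}

    static Optional<Decision> resolve(List<PriceRule> rules, String tenantId,
                                      Map<String, String> context) {
        return rules.stream()
            // Tenant isolation first: rules of other tenants are never candidates.
            .filter(rule -> rule.tenantId().equals(tenantId))
            // Every populated dimension of the rule must equal the context value.
            .filter(rule -> rule.dimensions().entrySet().stream()
                .allMatch(e -> e.getValue().equals(context.get(e.getKey()))))
            // Assumption: the rule with the most populated dimensions is the most specific.
            .max(Comparator.comparingInt((PriceRule rule) -> rule.dimensions().size()))
            // The evidence carries the dimensions that were actually used, in stable key order.
            .map(rule -> new Decision(tenantId, rule.priceBookVersion(),
                                      new TreeMap<>(rule.dimensions())));
    }
}
```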

11.2 Price Decision Evidence

A price result should carry:

{
  "tenantId": "acme",
  "quoteId": "...",
  "priceBookVersion": "APAC-ENT-2026-v7",
  "priceSegment": "APAC_ENTERPRISE_DIRECT",
  "customerSegment": "STRATEGIC_ENTERPRISE",
  "channel": "DIRECT",
  "region": "ID",
  "effectiveAt": "2026-07-02T00:00:00Z",
  "components": []
}

This is required for reproducibility.

Without segment evidence, you cannot answer:

Why did this customer get this price?


12. Approval Segmentation

Approval policy is not global.

A 20% discount may be normal in one market and suspicious in another.

Approval policy depends on:

  • tenant
  • sales org
  • legal entity
  • channel
  • customer segment
  • product family
  • contract term
  • margin band
  • discount authority
  • risk class

12.1 Approval Authority Scope

CREATE TABLE approval_authority_scope (
  tenant_id text NOT NULL,
  id uuid NOT NULL,
  principal_id text NOT NULL,
  sales_org_id text,
  region_code text,
  customer_segment text,
  product_family text,
  max_discount_percent numeric(9,4),
  max_contract_value numeric(19,4),
  effective_from timestamptz NOT NULL,
  effective_to timestamptz,
  PRIMARY KEY (tenant_id, id)
);

The scope must be snapshotted into approval evidence.

Do not merely store:

approved_by = alice

Store:

approved_by = alice
approved_under_scope = Regional VP APAC Enterprise v3
authority_evidence_hash = ...

Authority can change later.

The quote must remain defensible.
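
One way to snapshot authority into evidence is to hash a canonical rendering of the scope at approval time. The sketch below is an assumption-laden illustration: the record shapes, the pipe-separated canonical string, and the choice of SHA-256 are mine, not the series' defined format. It uses java.util.HexFormat, so it needs Java 17 or later.

```java
import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

/** Hypothetical sketch: approval evidence that stays verifiable after the scope changes. */
final class ApprovalEvidence {

    record AuthorityScope(String tenantId, String principalId, String scopeName,
                          int scopeVersion, BigDecimal maxDiscountPercent) {

        boolean covers(String tenant, BigDecimal discountPercent) {
            return tenantId.equals(tenant) && discountPercent.compareTo(maxDiscountPercent) <= 0;
        }

        // Assumed canonical form; trailing zeros are stripped so 25 and 25.0000 hash identically.
        String canonical() {
            return String.join("|", tenantId, principalId, scopeName, "v" + scopeVersion,
                maxDiscountPercent.stripTrailingZeros().toPlainString());
        }
    }

    record Evidence(String approvedBy, String approvedUnderScope, String authorityEvidenceHash) {}

    static Evidence approve(AuthorityScope scope, String tenantId, BigDecimal discountPercent) {
        if (!scope.covers(tenantId, discountPercent)) {
            throw new IllegalStateException("discount exceeds approval authority");
        }
        return new Evidence(scope.principalId(),
            scope.scopeName() + " v" + scope.scopeVersion(),
            sha256Hex(scope.canonical()));
    }

    static String sha256Hex(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(text.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

If the scope is later edited, recomputing the hash over the stored snapshot still matches the evidence, while the live scope no longer does.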


13. Order Segmentation

Order segmentation decides who can fulfill and operate the order.

It depends on:

  • tenant
  • legal entity
  • fulfillment region
  • product family
  • inventory domain
  • partner/reseller responsibility
  • operations queue
  • support tier

Do not use quote owner as fulfillment owner.

Sales ownership and operations ownership are different.

13.1 Operations Queue Mapping

order line product family + region + partner flag -> fulfillment group

Example:

{
  "tenantId": "acme",
  "orderId": "...",
  "lineId": "...",
  "productFamily": "connectivity",
  "region": "ID-JKT",
  "requiresFieldWork": true,
  "fulfillmentGroup": "ID_CONNECTIVITY_FIELD_OPS"
}

This mapping must be versioned.

A line created yesterday should not silently move queues because config changed today.


14. Kafka Tenant Boundary

Kafka events must include tenant context.

Minimum envelope:

{
  "eventId": "...",
  "eventType": "quote.approved.v1",
  "tenantId": "acme",
  "aggregateType": "quote",
  "aggregateId": "...",
  "aggregateVersion": 17,
  "occurredAt": "2026-07-02T10:00:00Z",
  "correlationId": "...",
  "causationId": "...",
  "dataClassification": "commercial-confidential",
  "payload": {}
}
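
On the producing side, the envelope can be modeled as a type that cannot exist without a tenant. The record below is a hedged sketch: the field names mirror the JSON above, but the validation rules, the raw-JSON payload field, and the partitionKey() format are assumptions added for illustration.

```java
import java.time.Instant;
import java.util.Objects;

/** Hypothetical envelope type: construction fails when tenant context is missing. */
record EventEnvelope(String eventId, String eventType, String tenantId,
                     String aggregateType, String aggregateId, long aggregateVersion,
                     Instant occurredAt, String correlationId, String causationId,
                     String dataClassification, String payloadJson) {

    EventEnvelope {
        requireText("eventId", eventId);
        requireText("eventType", eventType);
        requireText("tenantId", tenantId);          // no tenant, no event
        requireText("aggregateType", aggregateType);
        requireText("aggregateId", aggregateId);
        Objects.requireNonNull(occurredAt, "occurredAt");
    }

    /**
     * Assumed partition key: tenant plus aggregate identity, so all events of one
     * aggregate stay ordered on one partition and tenants never share a key.
     */
    String partitionKey() {
        return tenantId + ":" + aggregateType + ":" + aggregateId;
    }

    private static void requireText(String field, String value) {
        if (value == null || value.isBlank()) {
            throw new IllegalArgumentException(field + " is required");
        }
    }
}
```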

14.1 Topic Strategy

| Strategy | Good For | Risk |
| --- | --- | --- |
| Shared topic with tenant in envelope | many tenants, low ops overhead | consumers must enforce tenant filtering |
| Topic per tenant | strict isolation, high-value tenants | topic explosion, harder fleet ops |
| Topic per domain and region | data residency/regional ops | routing complexity |
| Dedicated cluster per tenant | extreme isolation | expensive and operationally heavy |

Default:

cpq.quote.events.v1
cpq.order.events.v1
cpq.catalog.events.v1
cpq.workflow.events.v1

with mandatory tenantId in envelope.

Escalate to tenant-specific topics only for evidence-based isolation requirements.

14.2 Consumer Rule

Every consumer must either:

  1. process all tenants safely, or
  2. explicitly allowlist tenants.

Never rely on accidental filtering.

if (!tenantAccessPolicy.canProcess(event.tenantId())) {
    metrics.increment("event.skipped.unauthorized_tenant");
    return;
}

This matters for projection workers, notification workers, document workers, and workflow workers.


15. Redis Tenant Boundary

Redis keys must be tenant-scoped.

Bad:

quote-preview:123
catalog:offering:VPN_BASIC
idempotency:submit-order:abc

Good:

prod:quote-service:tenant:acme:quote-preview:123
prod:catalog-service:tenant:acme:offering:VPN_BASIC:v42
prod:quote-service:tenant:acme:idempotency:submit-order:abc

15.1 Key Prefix Contract

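<env>:<service>:tenant:<tenantId>:<domain>:<key>[:<version>]

The version suffix is optional; use it for entries derived from versioned data such as catalog publications.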

A key without tenant should be reviewed.

Allowed global keys are rare:

  • public static reference data
  • service health marker
  • global rate-limit metadata
  • distributed configuration version

Even then, document it.
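
The key prefix contract is easier to keep when business code never concatenates key strings itself. The builder below is a minimal sketch under stated assumptions: the class name and method names are hypothetical, and the character rule for structural segments is a conservative choice of mine that rejects colons so a tenant id can never spill into another tenant's namespace.

```java
import java.util.regex.Pattern;

/** Hypothetical key builder for <env>:<service>:tenant:<tenantId>:<domain>:<key>[:<version>]. */
final class TenantKeyBuilder {

    // Assumed rule: structural segments may not contain ':' or whitespace.
    private static final Pattern SEGMENT = Pattern.compile("[A-Za-z0-9_.-]+");

    private final String env;
    private final String service;

    TenantKeyBuilder(String env, String service) {
        this.env = segment("env", env);
        this.service = segment("service", service);
    }

    String tenantKey(String tenantId, String domain, String key) {
        if (key == null || key.isBlank()) {
            throw new IllegalArgumentException("key is required");
        }
        return String.join(":", env, service, "tenant",
            segment("tenantId", tenantId), segment("domain", domain), key);
    }

    String tenantKey(String tenantId, String domain, String key, String version) {
        return tenantKey(tenantId, domain, key) + ":" + segment("version", version);
    }

    private static String segment(String name, String value) {
        if (value == null || !SEGMENT.matcher(value).matches()) {
            throw new IllegalArgumentException("invalid " + name + ": " + value);
        }
        return value;
    }
}
```

With `new TenantKeyBuilder("prod", "quote-service")`, the call `tenantKey("acme", "quote-preview", "123")` yields the first "Good" key shown earlier.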

15.2 Redis Must Not Decide Tenant Authority

Redis can cache entitlement result.

Redis must not be the source of entitlement truth.

Correct:

DB/policy engine decides -> Redis caches short-lived result -> DB/policy can invalidate

Wrong:

Redis key exists -> user is authorized

16. Camunda 7 Tenant Boundary

There are two practical approaches for this platform.

16.1 Single Engine With Tenant Identifiers

Use when:

  • many tenants
  • similar workflow definitions
  • shared operations team
  • moderate isolation requirement

Pattern:

runtimeService
  .createProcessInstanceByKey("order_fulfillment")
  .processDefinitionTenantId(tenantId)
  .businessKey("order:" + tenantId + ":" + orderId)
  .setVariables(variables)
  .execute();

Key rules:

  • tenant id must be explicit when starting process instances
  • business key must include tenant-aware business identity
  • task queries must be filtered by tenant or hidden behind custom service APIs
  • do not expose raw Camunda REST APIs to tenant users
  • admin tools require strict operator authorization

16.2 One Engine Per Tenant

Use when:

  • strict isolation is required
  • tenant-specific process definitions are heavy
  • tenant runtime lifecycle differs
  • regulatory or contractual boundary requires separation

But do not choose this casually.

It increases:

  • deployment complexity
  • migration complexity
  • job executor tuning complexity
  • monitoring complexity
  • process definition drift

Camunda 7 documentation describes one-engine-per-tenant as a supported multi-tenancy approach and notes that each process engine can use a different data source, schema, or table prefix. It also notes shared job execution options for multiple engines.

Reference: https://docs.camunda.org/manual/7.24/user-guide/process-engine/multi-tenancy/

16.3 Custom Workflow Service Boundary

The CPQ/OMS services should not call raw Camunda tenant APIs directly everywhere.

Use a Workflow Service facade:

POST /internal/workflows/order-fulfillment/start
POST /internal/workflows/quote-approval/start
POST /internal/workflows/messages/inventory-reserved/correlate
GET  /internal/workflows/tasks?tenantId=...

The facade enforces:

  • tenant authorization
  • process key allowlist
  • variable schema
  • business key format
  • audit
  • correlation id
  • operator permission
  • migration fence

17. Search and Reporting Tenant Boundary

Search is one of the easiest places to leak data.

Because search often optimizes for relevance, not authorization.

Rule:

Authorization must be applied before ranking and before pagination.

Bad:

search all quotes -> rank -> paginate -> filter unauthorized results

This produces short or empty pages and can leak result counts from rows the caller is not allowed to see.

Better:

apply tenant + access predicate -> rank -> paginate
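
The ordering can be illustrated with an in-memory projection. This is a sketch only: the document shape, the group-overlap access rule, and the precomputed score are hypothetical stand-ins for whatever the real search engine provides.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

/** Hypothetical search over a projection: authorize, then rank, then paginate. */
final class AuthorizedQuoteSearch {

    record QuoteDoc(String tenantId, String quoteNumber, Set<String> authorizedGroupIds, double score) {}

    record Page(List<String> quoteNumbers, long totalVisible) {}

    static Page search(List<QuoteDoc> index, String tenantId, Set<String> callerGroups,
                       int page, int size) {
        Comparator<QuoteDoc> byScore = Comparator.comparingDouble(QuoteDoc::score);

        // 1. Tenant and access predicates run first, so unauthorized rows are never ranked.
        List<QuoteDoc> visible = index.stream()
            .filter(doc -> doc.tenantId().equals(tenantId))
            .filter(doc -> !Collections.disjoint(doc.authorizedGroupIds(), callerGroups))
            // 2. Rank only what the caller may see (highest score first, number as tie-break).
            .sorted(byScore.reversed().thenComparing(QuoteDoc::quoteNumber))
            .toList();

        // 3. Paginate last; the total counts visible rows only.
        List<String> numbers = visible.stream()
            .skip((long) page * size)
            .limit(size)
            .map(QuoteDoc::quoteNumber)
            .toList();
        return new Page(numbers, visible.size());
    }
}
```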

17.1 Projection Fields

Quote search projection should include:

tenant_id
quote_id
quote_number
customer_id
owner_id
sales_org_id
region_code
state
approval_state
authorized_group_ids
visibility_scope
updated_at

Do not force search service to call Quote Service for every row to know visibility.

Materialize enough authorization fields into the projection.

But keep them derived and rebuildable.


18. Document and Artifact Tenant Boundary

Documents are high-risk.

A quote PDF can expose:

  • prices
  • discounts
  • customer identity
  • contract terms
  • approval conditions
  • internal notes if generated incorrectly

Artifact key design:

s3://cpq-artifacts/prod/tenant/acme/quote/quote-123/rev/4/artifact/proposal-v2.pdf

Metadata:

{
  "tenantId": "acme",
  "artifactType": "quote-proposal",
  "quoteId": "quote-123",
  "quoteRevision": 4,
  "classification": "commercial-confidential",
  "createdBy": "document-service",
  "contentHash": "..."
}

Access must be authorized through application service.

Do not expose raw object store paths as permanent public URLs unless signed and time-limited.


19. Admin Tenant Boundary

Admin users are dangerous because they cross ordinary user boundaries.

Define admin scopes:

| Admin Scope | Permission |
| --- | --- |
| Tenant Admin | manage users/config for one tenant |
| Catalog Admin | publish catalog for tenant/segment |
| Pricing Admin | manage price books for tenant/segment |
| Policy Admin | manage approval/entitlement rules |
| Workflow Admin | deploy/suspend/migrate process definitions |
| Operator | retry jobs, manage incidents, inspect health |
| Super Admin | cross-tenant platform administration |

Never treat admin=true as enough.

An operator who can retry a workflow job may indirectly trigger fulfillment.

A workflow admin who can update variables may alter process behavior.

Camunda 7 authorization documentation explicitly warns that variable update permissions can trigger other changes in the process, for example by causing conditional events to evaluate successfully.

Reference: https://docs.camunda.org/manual/7.24/user-guide/process-engine/authorization-service/


20. Tenant Provisioning Lifecycle

Tenant provisioning is not a row insert.

Provisioning must create or configure:

  • tenant master record
  • legal entities
  • user/group mappings
  • catalog baseline
  • price baseline
  • approval policy baseline
  • workflow deployments or tenant identifiers
  • Redis namespace policy
  • Kafka consumer access if isolated
  • document storage prefix
  • search projection namespace
  • audit retention rules
  • default admin scopes

Do not activate tenant until baseline config is complete.

20.1 Tenant Activation Checklist

[ ] tenant master exists
[ ] identity provider mapping tested
[ ] user/group/role mappings loaded
[ ] legal entity configured
[ ] catalog publication active
[ ] price book active
[ ] approval policy active
[ ] workflow process definitions deployed or shared definitions mapped
[ ] notification templates configured
[ ] document artifact prefix configured
[ ] audit retention configured
[ ] tenant health check passes
[ ] sample quote create/configure/price/approve flow passes

21. Tenant Suspension

Suspension must be explicit.

A suspended tenant may allow:

  • read-only access
  • payment/billing reconciliation
  • operator export
  • audit access

but block:

  • new quote creation
  • quote acceptance
  • order submission
  • catalog publication
  • price book publication
  • workflow process start

Suspension policy:

ACTIVE:
  all authorized actions allowed

SUSPENDED_READ_ONLY:
  reads allowed
  new commercial commitments blocked
  workflow recovery allowed by operator

SUSPENDED_SECURITY:
  reads blocked except security/admin
  active sessions revoked
  workflow side effects paused

DECOMMISSIONING:
  only export, archive, retention, legal hold operations allowed

Do not implement tenant suspension only at UI level.

Every command handler must check tenant operational state.
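
A command-side guard for the suspension policy might look like the sketch below. The four tenant states come from the policy above; the command categories, the privileged flag, and the exact mapping (for example, letting a privileged operator export while read-only) are assumptions made to keep the example small.

```java
/** Hypothetical guard that each command handler calls before doing any work. */
final class TenantStateGuard {

    enum TenantState { ACTIVE, SUSPENDED_READ_ONLY, SUSPENDED_SECURITY, DECOMMISSIONING }

    // Assumed coarse command categories; a real system would classify each command type.
    enum CommandKind { READ, COMMERCIAL_COMMITMENT, WORKFLOW_RECOVERY, EXPORT_OR_ARCHIVE }

    /**
     * Answers only the tenant-state question. Ordinary authorization still applies
     * on top, including in the ACTIVE state.
     */
    static boolean isAllowed(TenantState state, CommandKind kind, boolean privilegedOperator) {
        return switch (state) {
            case ACTIVE -> true;
            case SUSPENDED_READ_ONLY ->
                kind == CommandKind.READ
                    || (privilegedOperator
                        && (kind == CommandKind.WORKFLOW_RECOVERY
                            || kind == CommandKind.EXPORT_OR_ARCHIVE));
            // Reads are blocked except for security/admin principals.
            case SUSPENDED_SECURITY -> kind == CommandKind.READ && privilegedOperator;
            // Only export, archive, retention, and legal hold style operations remain.
            case DECOMMISSIONING -> kind == CommandKind.EXPORT_OR_ARCHIVE;
        };
    }
}
```

Because the switch is an expression over the enum, adding a new tenant state without deciding its policy fails to compile.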


22. Tenant-Aware Idempotency

Idempotency keys are tenant-scoped.

Bad:

idempotency_key = requestId

Better:

idempotency_key = tenantId + actorId + commandType + clientRequestId

Database table:

CREATE TABLE idempotency_record (
  tenant_id text NOT NULL,
  actor_id text NOT NULL,
  command_type text NOT NULL,
  idempotency_key text NOT NULL,
  request_hash text NOT NULL,
  response_hash text,
  status text NOT NULL,
  created_at timestamptz NOT NULL,
  expires_at timestamptz NOT NULL,
  PRIMARY KEY (tenant_id, actor_id, command_type, idempotency_key)
);

A client in tenant A must not collide with tenant B.
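
The behavior of that composite key can be sketched with an in-memory store. Everything here is illustrative: the class and outcome names are hypothetical, a synchronized map stands in for the idempotency_record table, and the response is kept as a plain string rather than a hash.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical in-memory idempotency store keyed like the table's primary key. */
final class TenantIdempotencyStore {

    record Key(String tenantId, String actorId, String commandType, String idempotencyKey) {}

    record Entry(String requestHash, String response) {}

    enum Outcome { FIRST_SEEN, REPLAY, CONFLICT }

    record Result(Outcome outcome, String response) {}

    private final Map<Key, Entry> records = new HashMap<>();

    /**
     * Runs the handler once per key. A repeat with the same request hash replays the
     * stored response; a repeat with a different hash is reported as a conflict.
     */
    synchronized Result execute(Key key, String requestHash, Supplier<String> handler) {
        Entry existing = records.get(key);
        if (existing == null) {
            String response = handler.get();
            records.put(key, new Entry(requestHash, response));
            return new Result(Outcome.FIRST_SEEN, response);
        }
        if (!existing.requestHash().equals(requestHash)) {
            // Same key, different payload: never silently return the old response.
            return new Result(Outcome.CONFLICT, null);
        }
        return new Result(Outcome.REPLAY, existing.response());
    }
}
```

Two tenants that happen to send the same client request id get independent records, because the tenant is a component of the key rather than a prefix someone might forget.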


23. Cross-Tenant Operations

Some operations are legitimately cross-tenant:

  • platform-wide health dashboard
  • security investigation
  • global catalog template management
  • fleet migration
  • analytics aggregation
  • global audit export
  • support impersonation

Treat these as privileged workflows.

Do not implement them by silently omitting tenant filters.

Cross-tenant operation must require:

explicit cross_tenant_scope
reason code
operator identity
approval when high risk
time-limited access
full audit
read-only default

23.1 Support Impersonation

Support impersonation is especially dangerous.

Safer model:

operator requests session -> approval or policy grants -> scoped session created -> all actions audited as operator acting for user/tenant

Never replace the actor identity with the impersonated user alone.

Audit must show both:

actor = support.operator@company
acting_as = user@tenant
reason = CASE-12345
scope = quote-read-only
expires_at = ...
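
A scoped impersonation session that keeps both identities could be sketched as follows. The record and method names are hypothetical, and the rule that denied attempts are also audited is an assumption in line with the "full audit" requirement above.

```java
import java.time.Instant;
import java.util.Set;

/** Hypothetical sketch: every attempt is audited with both the operator and the impersonated user. */
final class SupportImpersonation {

    record Session(String actor, String actingAs, String tenantId, String reason,
                   Set<String> scopes, Instant expiresAt) {
        Session {
            if (reason == null || reason.isBlank()) {
                throw new IllegalArgumentException("a reason code is required");
            }
            scopes = Set.copyOf(scopes); // defensive, immutable copy
        }

        boolean permits(String tenant, String scope, Instant now) {
            return tenantId.equals(tenant) && scopes.contains(scope) && now.isBefore(expiresAt);
        }
    }

    record AuditEntry(String actor, String actingAs, String tenantId, String reason,
                      String scope, String operation, boolean allowed, Instant at) {}

    /** Evaluates the attempt and returns the audit entry, whether it was allowed or denied. */
    static AuditEntry attempt(Session session, String tenantId, String scope,
                              String operation, Instant now) {
        boolean allowed = session.permits(tenantId, scope, now);
        // The operator stays the actor; the impersonated user is recorded separately.
        return new AuditEntry(session.actor(), session.actingAs(), tenantId,
            session.reason(), scope, operation, allowed, now);
    }
}
```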

24. Multi-Tenant Threat Model

Use this checklist.

| Threat | Example | Control |
| --- | --- | --- |
| Broken object-level authorization | user changes quote id in URL | object ownership check |
| Missing tenant filter | search query returns other tenant rows | mandatory tenant predicate |
| Cache leakage | Redis key missing tenant prefix | key namespace contract |
| Event leakage | notification consumer processes wrong tenant event | tenant allowlist + envelope |
| Workflow leakage | task query not tenant filtered | Workflow Service facade |
| Admin overreach | tenant admin edits global pricing | scoped admin roles |
| Report leakage | aggregate report includes hidden tenant | report authorization layer |
| Document leakage | shared artifact URL guessed | signed URL + metadata auth |
| Log leakage | logs include confidential price data | redaction + classification |
| Test leakage | fixtures mix tenant data | scenario catalog tenant assertions |

OWASP's multi-tenant guidance focuses on tenant isolation and preventing cross-tenant attacks, and OWASP API Security identifies object-level authorization as a critical API risk. In CPQ/OMS, both concerns are central because object identifiers are everywhere: quotes, orders, tasks, documents, accounts, and workflow instances.


25. Tenant Isolation Tests

Tenant isolation needs dedicated tests.

Not incidental tests.

25.1 API Test

Given tenant A user
And tenant B quote exists
When user requests /quotes/{tenantBQuoteId}
Then response is 404 or 403 according to disclosure policy
And audit records denied access

Disclosure policy matters.

For sensitive objects, prefer 404 to avoid confirming existence.

25.2 Repository Test

Given same UUID-like external reference exists under two tenants
When repository loads by tenant A
Then only tenant A row is returned

25.3 Search Test

Given quote with matching search term in tenant B
When tenant A user searches same term
Then tenant B result is not counted, ranked, or paginated

25.4 Kafka Consumer Test

Given consumer is configured for tenant A
When tenant B event is received
Then consumer skips event and increments unauthorized tenant metric

25.5 Redis Test

Given tenant A and tenant B have same quote id
When preview cache is written for both
Then keys are distinct

25.6 Camunda Task Test

Given task exists for tenant B
When tenant A user queries worklist
Then task is not visible
And raw Camunda task id is never exposed as sufficient access token

26. Operational Metrics

Track tenant-aware metrics.

api_requests_total{tenant,service,operation,status}
command_denied_total{tenant,reason}
cross_tenant_access_denied_total{sourceTenant,targetTenant,operation}
quote_created_total{tenant,channel,region}
order_failed_total{tenant,fulfillmentGroup,reason}
workflow_incidents_total{tenant,processKey}
projection_lag_seconds{tenant,projection}
redis_hit_ratio{tenant,cacheName}
consumer_lag{tenant,consumerGroup,topic}
tenant_suspended_command_blocked_total{tenant,operation}

Be careful with metric cardinality.

If tenant count is huge, use tenant tier or sampled tenant labels for high-volume metrics.

But security denial metrics should preserve enough forensic detail in logs/audit.


27. Design Smells

Smell 1: findById(id) for Tenant-Owned Object

Should be:

findById(tenantId, id)

Smell 2: Tenant in UI Only

Server must enforce tenant.

UI filtering is convenience, not security.

Smell 3: Shared Redis Key Without Tenant

Almost always wrong.

Smell 4: Raw Camunda REST Exposed to Users

Dangerous unless Camunda authorization and tenant restrictions are fully configured and understood.

Smell 5: Admin Role Without Scope

ADMIN is not enough.

Use scoped permissions.
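As an illustration of what "scoped" can mean in code, the sketch below ties every grant to a tenant and an explicit operation set. `ScopedGrant`, `ScopedAuthorizer`, and the operation names are assumptions made for the example.

```java
import java.util.Set;

/** A grant is a role limited to one tenant and an explicit set of operations. */
record ScopedGrant(String tenantId, String role, Set<String> operations) {}

final class ScopedAuthorizer {
    private ScopedAuthorizer() {}

    /** The role name alone never authorizes; tenant and operation must both match. */
    static boolean isAllowed(Set<ScopedGrant> grants, String tenantId, String operation) {
        return grants.stream()
                .anyMatch(g -> g.tenantId().equals(tenantId) && g.operations().contains(operation));
    }
}
```

An `ADMIN` grant for one tenant therefore says nothing about any other tenant, and nothing about operations it does not list.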

Smell 6: Search Filter After Pagination

Authorization must happen before ranking and pagination.
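The ordering can be shown with plain collections. This is only a sketch: `SearchHit`, `ResultPage`, and `TenantSearch` are hypothetical, and a real search engine would apply the tenant predicate inside the query itself.

```java
import java.util.Comparator;
import java.util.List;

record SearchHit(String tenantId, String quoteId, double score) {}

record ResultPage(List<SearchHit> items, long totalCount) {}

final class TenantSearch {
    private TenantSearch() {}

    static ResultPage search(List<SearchHit> candidates, String tenantId, int pageIndex, int pageSize) {
        List<SearchHit> visible = candidates.stream()
                .filter(h -> h.tenantId().equals(tenantId))                      // 1. authorize
                .sorted(Comparator.comparingDouble(SearchHit::score).reversed()) // 2. rank
                .toList();
        List<SearchHit> items = visible.stream()
                .skip((long) pageIndex * pageSize)                               // 3. paginate
                .limit(pageSize)
                .toList();
        // The total is computed from visible hits only, so it cannot leak foreign counts.
        return new ResultPage(items, visible.size());
    }
}
```

Filtering after pagination would produce short pages and a total count that reveals how many foreign-tenant rows matched.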

Smell 7: Segment Is Treated as Tenant

Segments are behavior selectors.

Tenants are isolation boundaries.

Smell 8: Tenant-Specific Forked Codebase

Configuration and policy should vary.

Core code should remain shared unless isolation requires a runtime split.


28. Reference Tenant Context Flow

No boundary relies on the previous boundary blindly.

Each one reasserts the tenant.


29. Minimal Implementation Plan

Step 1 — Tenant Master

Create:

tenant
tenant_state
tenant_segment_config
tenant_user_membership
tenant_admin_scope

Step 2 — Tenant Context Filter

In JAX-RS/Jersey:

  • authenticate principal
  • resolve requested tenant
  • verify membership
  • attach TenantContext to request scope
  • add correlation id
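The steps above can be sketched as a plain resolver that a JAX-RS `ContainerRequestFilter` could delegate to. Everything here is an assumption for illustration: the record shape of `TenantContext`, the `TenantContextResolver` and `TenantAccessDeniedException` names, the in-memory membership map, and the premise that authentication has already produced the principal name.

```java
import java.util.Map;
import java.util.Set;
import java.util.UUID;

record TenantContext(String tenantId, String principal, String correlationId) {}

final class TenantAccessDeniedException extends RuntimeException {
    TenantAccessDeniedException(String message) {
        super(message);
    }
}

final class TenantContextResolver {
    // Stand-in for a lookup against tenant_user_membership.
    private final Map<String, Set<String>> tenantsByPrincipal;

    TenantContextResolver(Map<String, Set<String>> tenantsByPrincipal) {
        this.tenantsByPrincipal = Map.copyOf(tenantsByPrincipal);
    }

    TenantContext resolve(String principal, String requestedTenant, String incomingCorrelationId) {
        if (principal == null || principal.isBlank()) {
            throw new TenantAccessDeniedException("unauthenticated request");
        }
        if (requestedTenant == null || requestedTenant.isBlank()) {
            throw new TenantAccessDeniedException("no tenant requested");
        }
        // Membership is verified server-side; the requested tenant is only a claim.
        Set<String> memberships = tenantsByPrincipal.getOrDefault(principal, Set.of());
        if (!memberships.contains(requestedTenant)) {
            throw new TenantAccessDeniedException("principal is not a member of the requested tenant");
        }
        // Reuse the caller's correlation id if present, otherwise mint one.
        String correlationId = (incomingCorrelationId == null || incomingCorrelationId.isBlank())
                ? UUID.randomUUID().toString()
                : incomingCorrelationId;
        return new TenantContext(requestedTenant, principal, correlationId);
    }
}
```

Keeping the logic outside the filter class makes it testable without a container; the filter itself would only extract headers, call `resolve`, and store the result in request scope.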

Step 3 — Repository Signatures

For tenant-owned aggregates:

findById(TenantContext tenant, Id id)
save(TenantContext tenant, Aggregate aggregate)
search(TenantContext tenant, Query query)

No generic tenant-less access.

Step 4 — Database Constraints

  • tenant_id on tenant-owned tables
  • composite primary/foreign keys where practical
  • tenant-leading indexes
  • tenant-aware uniqueness

Step 5 — Event Envelope

Require tenantId in every event envelope.

Reject events without tenant unless explicitly global.
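A minimal sketch of enforcing that rule at construction time, so an invalid envelope cannot exist in memory. The `EventEnvelope` field names and the boolean `global` flag are assumptions; the platform's real envelope may classify events differently.

```java
import java.util.Objects;

/** Hypothetical envelope: tenantId is mandatory unless the event is explicitly global. */
record EventEnvelope(String eventId, String eventType, String tenantId, boolean global, String payload) {
    EventEnvelope {
        Objects.requireNonNull(eventId, "eventId");
        Objects.requireNonNull(eventType, "eventType");
        boolean hasTenant = tenantId != null && !tenantId.isBlank();
        if (!global && !hasTenant) {
            throw new IllegalArgumentException("tenantId is required for non-global events");
        }
    }
}
```

Producers and consumers that share this type both get the check for free; consumers reading raw bytes still need to validate after deserialization.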

Step 6 — Redis Namespace

Implement key builder.

No manual string keys in business code.
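A key builder can be very small. The sketch below assumes a `tenant:<tenantId>:<cacheName>:<entryId>` layout, which is an illustrative format rather than the platform's actual convention; `TenantCacheKeys` is a hypothetical name.

```java
final class TenantCacheKeys {
    private TenantCacheKeys() {}

    /** Builds a namespaced key; every segment is validated so keys cannot collide by accident. */
    static String of(String tenantId, String cacheName, String entryId) {
        return String.join(":",
                "tenant",
                segment("tenantId", tenantId),
                segment("cacheName", cacheName),
                segment("entryId", entryId));
    }

    private static String segment(String name, String value) {
        // ':' is the separator, so allowing it inside a segment would make keys ambiguous.
        if (value == null || value.isBlank() || value.contains(":")) {
            throw new IllegalArgumentException(name + " must be non-blank and must not contain ':'");
        }
        return value;
    }
}
```

Business code then calls `TenantCacheKeys.of(...)` and never concatenates key strings itself, which makes a missing tenant a compile-time or construction-time problem instead of a silent shared key.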

Step 7 — Workflow Facade

All Camunda operations go through Workflow Service.

No raw Camunda exposure to CPQ UI.
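A rough sketch of the facade idea, with the engine hidden behind a port. `TaskEnginePort`, `InMemoryTaskEngine`, `WorkTask`, and `WorkflowFacade` are hypothetical names; a real adapter would call the Camunda 7 API behind the same interface.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

record WorkTask(String taskId, String tenantId, String name) {}

/** Hypothetical port over the workflow engine. */
interface TaskEnginePort {
    List<WorkTask> tasksForTenant(String tenantId);
    Optional<WorkTask> findTask(String taskId);
    void complete(String taskId);
}

/** In-memory engine used only to make the sketch runnable. */
final class InMemoryTaskEngine implements TaskEnginePort {
    private final Map<String, WorkTask> tasks = new LinkedHashMap<>();

    InMemoryTaskEngine(List<WorkTask> initial) {
        initial.forEach(t -> tasks.put(t.taskId(), t));
    }

    public List<WorkTask> tasksForTenant(String tenantId) {
        return tasks.values().stream().filter(t -> t.tenantId().equals(tenantId)).toList();
    }

    public Optional<WorkTask> findTask(String taskId) {
        return Optional.ofNullable(tasks.get(taskId));
    }

    public void complete(String taskId) {
        tasks.remove(taskId);
    }
}

final class WorkflowFacade {
    private final TaskEnginePort engine;

    WorkflowFacade(TaskEnginePort engine) {
        this.engine = engine;
    }

    List<WorkTask> worklist(String callerTenant) {
        return engine.tasksForTenant(callerTenant);
    }

    /** Knowing a task id grants nothing; the task's tenant must match the caller's. */
    boolean complete(String callerTenant, String taskId) {
        Optional<WorkTask> task = engine.findTask(taskId);
        if (task.isEmpty() || !task.get().tenantId().equals(callerTenant)) {
            return false;
        }
        engine.complete(taskId);
        return true;
    }
}
```

Returning the same result for "missing" and "foreign tenant" keeps the facade from confirming that a task id exists in another tenant.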

Step 8 — Tenant Isolation Test Suite

Add cross-tenant negative tests for every API and projection.


30. Production Checklist

[ ] tenant id is present on all tenant-owned tables
[ ] tenant-aware foreign keys prevent cross-tenant child linkage
[ ] tenant-leading indexes exist for operational queries
[ ] repository APIs require tenant context
[ ] object-level authorization verifies tenant ownership
[ ] search applies tenant/access predicate before ranking and pagination
[ ] Redis keys use tenant namespace builder
[ ] Kafka events include tenant id and classification
[ ] consumers enforce tenant scope or explicit all-tenant permission
[ ] Camunda process start/correlation includes tenant-aware business key
[ ] raw Camunda APIs are not exposed to tenant users
[ ] admin roles are scoped and audited
[ ] document artifacts are tenant-scoped and access-controlled
[ ] tenant provisioning has activation checklist
[ ] tenant suspension blocks commercial commands server-side
[ ] cross-tenant operations require reason code and audit
[ ] isolation tests cover API, DB, search, Redis, Kafka, Camunda, documents

Closing

Multi-tenancy is not achieved by adding tenant_id.

That is only the first move.

A production CPQ/OMS must carry tenant context through:

  • HTTP
  • domain commands
  • database constraints
  • event envelopes
  • cache keys
  • workflow instances
  • documents
  • read models
  • audit records
  • admin operations

The deeper design principle is this:

Isolation must be enforced at every boundary where data is read, changed, cached, emitted, rendered, searched, or operated.

Enterprise segmentation is equally important, but different.

Segments choose behavior.

Tenants enforce isolation.

Confusing the two creates either security leaks or operational chaos.
