Multitenancy and Data Partitioning

Learn Java Persistence, Database Integration, and JPA - Part 029

Multitenancy and data partitioning in Java persistence: database-per-tenant, schema-per-tenant, shared-schema discriminator, tenant context, Hibernate multitenancy, tenant leak prevention, migration, testing, and production failure modes.

[2026-06-30]

Part 029 — Multitenancy and Data Partitioning

Multitenancy is not a feature you add with a tenant_id column. It is a data isolation model.

A senior engineer treats multitenancy as a combination of:

identity boundary — who is the tenant?
authorization boundary — may this actor access this tenant?
routing boundary — which database/schema/partition should serve this request?
query boundary — can every query prove it is tenant-scoped?
migration boundary — can schema/data changes be rolled out safely per tenant?
operational boundary — can one tenant be backed up, restored, throttled, migrated, or debugged without harming others?

The dangerous misconception is this:

“Multitenancy means adding tenant_id to every table.”

That is only one implementation strategy. The real design question is:

“What blast radius do we accept when tenant data, traffic, migration, failure, or breach happens?”

This part focuses on production-grade multitenancy in Java persistence with JPA/Hibernate/Spring Data JPA.

1. Kaufman Deconstruction: What Skill Are We Practicing?

Using Kaufman's skill acquisition lens, multitenancy should be deconstructed into small subskills.

Subskill	What You Must Be Able To Do
Tenant model design	Define tenant, account, organization, workspace, region, subscription, and legal entity correctly
Isolation strategy selection	Choose database-per-tenant, schema-per-tenant, shared-schema, or hybrid partitioning
Tenant context propagation	Carry tenant identity safely across HTTP, async, scheduler, messaging, and transaction boundaries
Query enforcement	Prevent cross-tenant reads/writes by construction, not by developer discipline
Migration strategy	Run schema and data migrations safely for one, many, or all tenants
Performance engineering	Prevent hot tenants, noisy neighbors, bad partition keys, and global index bottlenecks
Security modelling	Detect tenant spoofing, confused deputy, IDOR, and tenant-leak failure modes
Testing discipline	Prove tenant isolation with integration, mutation, and adversarial tests

The target performance level:

Given a Java service with persistence requirements, you can design a multitenancy model, choose an isolation strategy, implement tenant-safe data access, test leakage scenarios, and operate the model during migration and incident response.

2. Core Mental Model

Multitenancy is a scoping invariant.

Every persisted fact must answer:

Who owns this data?
Who may observe it?
Who may mutate it?
Where is it physically stored?
How is access scoped by default?
What happens if scoping fails?

A tenant-safe persistence system must make the safe path the default path.

Bad design:

repository.findById(id); // hope the caller checks tenant elsewhere

Better design:

repository.findByTenantIdAndId(tenantId, id);

Stronger design:

TenantScopedEntityManager.forCurrentTenant()
    .find(Order.class, orderId);

Strongest design depends on your model, but the principle is stable:

Tenant scoping should be enforced at the lowest reliable boundary, then repeated at higher boundaries for defense in depth.

3. Tenancy Is Not Always Customer Tenancy

Before choosing database layout, define what a tenant means.

Term	Possible Meaning	Persistence Consequence
Tenant	Paying customer, organization, workspace, government agency, legal entity	Primary data isolation boundary
Account	Billing/subscription owner	May own multiple tenants/workspaces
User	Human/principal	Usually belongs to one or more tenants
Role	Authorization within tenant	Must be scoped by tenant
Region	Data residency or deployment region	Can force separate storage/routing
Environment	prod/sandbox/test tenant	May require separate lifecycle/purge policy
Partition	Physical distribution unit	May not equal tenant

Common mistake:

user_id == tenant_id

This breaks as soon as:

one user belongs to multiple organizations,
one organization has multiple workspaces,
one parent company owns multiple subsidiaries,
one legal entity requires regional data isolation,
an admin needs delegated access,
background jobs run without a human user.

A robust model separates:

Principal identity
Authorization scope
Tenant ownership
Storage routing
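
One way to keep these four concerns apart is to give each its own type, so that none can silently stand in for another. The sketch below is illustrative only: every type name in it (PrincipalRef, TenantMembershipRef, TenantOwnership, StorageRoute, MembershipDirectory) is hypothetical, and an in-memory list stands in for a real membership table.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.UUID;

// Hypothetical types for illustration; none of these names come from a library.
record PrincipalRef(UUID principalId) {}                                           // who is acting
record TenantMembershipRef(UUID principalId, UUID tenantId, Set<String> roles) {}  // what they may do, and where
record TenantOwnership(UUID tenantId, UUID resourceId) {}                          // who owns the data
record StorageRoute(UUID tenantId, String datasourceKey, String schemaName) {}     // where the data lives

final class MembershipDirectory {
    private final List<TenantMembershipRef> memberships;

    MembershipDirectory(List<TenantMembershipRef> memberships) {
        this.memberships = List.copyOf(memberships);
    }

    // A principal may belong to several tenants, so the tenant is always named explicitly.
    Optional<TenantMembershipRef> find(PrincipalRef principal, UUID tenantId) {
        return memberships.stream()
            .filter(m -> m.principalId().equals(principal.principalId()))
            .filter(m -> m.tenantId().equals(tenantId))
            .findFirst();
    }
}
```

Because the lookup takes both a principal and a tenant, the user_id == tenant_id shortcut has nowhere to hide: a user in two organizations simply has two memberships.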

4. Multitenancy Strategy Matrix

There are four common strategies.

Strategy	Isolation	Cost	Operational Complexity	Best For
Database per tenant	Very high	High	High	Regulated customers, large tenants, strong isolation
Schema per tenant	High	Medium-high	Medium-high	Moderate tenant count, stronger isolation than shared schema
Shared schema + tenant discriminator	Lower	Low	Medium	Many small tenants, SaaS scale, simpler ops
Hybrid	Variable	Variable	High	Real systems with mixed tenant sizes/risk profiles

There is no universal best choice.

The correct strategy depends on:

legal isolation requirements,
customer size distribution,
migration frequency,
number of tenants,
noisy neighbor tolerance,
backup/restore requirement,
operational tooling maturity,
reporting/analytics model,
regional/data residency requirements,
cost envelope.

5. Strategy 1: Database Per Tenant

Each tenant has its own database.

5.1 Advantages

Advantage	Why It Matters
Strong blast-radius isolation	Bad query, corruption, or breach can be limited to one tenant DB
Easier tenant backup/restore	Restore one tenant without filtering shared rows
Per-tenant scaling	Large tenant can get stronger hardware, read replica, partitioning
Compliance fit	Easier to prove separation for high-regulation customers
Custom retention	Tenant-specific purge and archival are easier

5.2 Disadvantages

Disadvantage	Consequence
Many connection pools	JVM memory and DB connection pressure increase quickly
Migration fan-out	Every migration must run across many databases
Operational inventory	Need tenant registry, health, version, backup state
Cross-tenant analytics harder	Requires ETL/data warehouse aggregation
Onboarding overhead	Provisioning must be automated

5.3 When It Makes Sense

Use database-per-tenant when:

tenants are few or medium count,
tenants are high value,
data isolation is contractually important,
tenants require custom backup/restore,
tenant data size differs dramatically,
regional/legal boundaries matter,
one tenant failure must not impact others.

Avoid it when:

you expect hundreds of thousands of tiny tenants,
your operations team cannot automate migration/provisioning,
your database license/cost model does not scale per database,
your application cannot manage dynamic datasource routing safely.

6. Strategy 2: Schema Per Tenant

All tenants share a database instance, but each tenant has a separate schema.

6.1 Advantages

Advantage	Why It Matters
Better isolation than shared schema	Table names/indexes are separated per tenant schema
Lower cost than database-per-tenant	One database instance can host many schemas
Easier per-tenant migration than row filtering	Schema version can be tracked per tenant
Backup may be possible per schema	Depends heavily on database vendor/tooling

6.2 Disadvantages

Disadvantage	Consequence
Schema count limits	Some DBs/tools degrade with very many schemas
Migration complexity	Need per-schema migration orchestration
Connection/session state risk	`SET search_path`/current schema must be reset correctly
Shared resource contention	CPU, IO, memory still shared

6.3 Failure Mode: Leaked Schema State

A classic bug:

Request A sets schema = tenant_a
Connection returned to pool
Request B gets same connection
Schema is still tenant_a
Request B reads tenant A data

Any schema routing strategy must guarantee:

before use: set tenant schema
finally: reset/validate schema
on error: discard unsafe connection if state unknown
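
The three guarantees above can be sketched with plain JDBC. This is a minimal sketch, not a library API: SchemaScopedConnections and its discarder callback are hypothetical, and it assumes the driver implements Connection.setSchema and Connection.getSchema for the target database. The discarder stands in for pool-specific eviction, because with most pools close() only returns the connection to the pool.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.function.Consumer;

// Hypothetical helper. Assumes the JDBC driver implements setSchema/getSchema.
final class SchemaScopedConnections {

    @FunctionalInterface
    interface SqlWork<T> {
        T run(Connection connection) throws SQLException;
    }

    private final String defaultSchema;
    // Placeholder for pool-specific eviction of a connection in unknown state.
    private final Consumer<Connection> discarder;

    SchemaScopedConnections(String defaultSchema, Consumer<Connection> discarder) {
        this.defaultSchema = defaultSchema;
        this.discarder = discarder;
    }

    <T> T withTenantSchema(Connection connection, String tenantSchema, SqlWork<T> work)
        throws SQLException {
        try {
            // Before use: set the tenant schema and confirm it was applied.
            connection.setSchema(tenantSchema);
            if (!tenantSchema.equals(connection.getSchema())) {
                throw new SQLException("schema not applied: " + tenantSchema);
            }
            return work.run(connection);
        } finally {
            // Finally: reset and validate, whether the work succeeded or failed.
            resetOrDiscard(connection);
        }
    }

    private void resetOrDiscard(Connection connection) {
        try {
            connection.setSchema(defaultSchema);
            if (!defaultSchema.equals(connection.getSchema())) {
                discarder.accept(connection); // state unknown: never reuse
            }
        } catch (SQLException resetFailed) {
            discarder.accept(connection);     // on error: discard
        }
    }
}
```

The reset runs in a finally block so that a failing query cannot leave tenant A's schema on a connection that request B will pick up next.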

7. Strategy 3: Shared Schema with Tenant Discriminator

All tenants share tables. Each tenant-owned row has a tenant discriminator column such as tenant_id.

Example:

create table orders (
    tenant_id uuid not null,
    id uuid not null,
    order_number varchar(64) not null,
    status varchar(32) not null,
    created_at timestamptz not null,
    primary key (tenant_id, id),
    unique (tenant_id, order_number)
);

Notice the composite primary key.

A less strict design often does this:

primary key (id),
unique (order_number)

That is usually wrong for shared-schema multitenancy because uniqueness becomes global when it should be tenant-local.

Better:

primary key (tenant_id, id),
unique (tenant_id, order_number)

7.1 Advantages

Advantage	Why It Matters
Lowest infrastructure cost	One schema/table set for many tenants
Easy onboarding	Insert tenant row, no database provisioning
Simple global migrations	One schema migration updates all tenants
Better connection efficiency	One pool can serve all tenants
Simple shared analytics	Rows already co-located

7.2 Disadvantages

Disadvantage	Consequence
Tenant leak risk	A missing predicate can expose another tenant's data
Noisy neighbor risk	Large tenant can affect shared indexes/tables
Harder tenant restore	Need row-level restore/filtering
Harder deletion	Must delete all tenant-owned rows across tables safely
Hot indexes	Global indexes may become large and skewed

Shared schema is cheap operationally but expensive intellectually. It requires intense discipline.

8. Strategy 4: Hybrid Tenancy

Most real SaaS systems eventually become hybrid.

Example:

Small tenants     -> shared schema
Medium tenants    -> schema per tenant
Enterprise tenant -> database per tenant
Regulated tenant  -> region-specific database

Hybrid is powerful but requires a tenant registry.

Tenant registry fields often include:

tenant_id
legal_name
status
plan
region
isolation_mode
datasource_key
schema_name
migration_version
encryption_key_ref
created_at
suspended_at

The registry becomes critical infrastructure. If it is wrong, routing is wrong.
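
To make the dependency on the registry concrete, here is a minimal routing sketch. It uses a subset of the fields listed above; TenantRouter, StorageTarget, the "ACTIVE" status value, and the shared datasource and schema names are assumptions made for illustration. The router fails closed: an unknown tenant, an inactive tenant, or an incomplete registry row produces an exception rather than a default route.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

enum IsolationMode { SHARED, SCHEMA, DATABASE }

// Field names mirror a subset of the registry columns listed above.
record TenantRegistryEntry(
    UUID tenantId, String status, IsolationMode isolationMode, String datasourceKey, String schemaName
) {}

record StorageTarget(String datasourceKey, String schemaName) {}

final class TenantRouter {
    private static final String SHARED_DATASOURCE = "shared"; // assumed name
    private static final String DEFAULT_SCHEMA = "app";       // assumed name

    private final Map<UUID, TenantRegistryEntry> registry;

    TenantRouter(Map<UUID, TenantRegistryEntry> registry) {
        this.registry = Map.copyOf(registry);
    }

    StorageTarget route(UUID tenantId) {
        TenantRegistryEntry entry = Optional.ofNullable(registry.get(tenantId))
            .orElseThrow(() -> new IllegalStateException("unknown tenant: " + tenantId));
        if (!"ACTIVE".equals(entry.status())) {
            throw new IllegalStateException("tenant is not active: " + tenantId);
        }
        return switch (entry.isolationMode()) {
            case SHARED -> new StorageTarget(SHARED_DATASOURCE, DEFAULT_SCHEMA);
            case SCHEMA -> new StorageTarget(SHARED_DATASOURCE, require(entry.schemaName(), "schema_name"));
            case DATABASE -> new StorageTarget(require(entry.datasourceKey(), "datasource_key"), DEFAULT_SCHEMA);
        };
    }

    // Fail closed: a half-filled registry row must never fall back to a shared route.
    private static String require(String value, String field) {
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("registry entry is missing " + field);
        }
        return value;
    }
}
```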

9. Tenant Context Propagation

A tenant context is the application-level representation of current tenant scope.

Example:

public record TenantContext(
    UUID tenantId,
    UUID principalId,
    Set<String> roles,
    String region,
    String correlationId
) {}

Avoid global mutable static state.

This is tempting:

public final class CurrentTenant {
    private static final ThreadLocal<UUID> TENANT = new ThreadLocal<>();

    public static void set(UUID tenantId) {
        TENANT.set(tenantId);
    }

    public static UUID get() {
        return TENANT.get();
    }

    public static void clear() {
        TENANT.remove();
    }
}

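ThreadLocal can work in servlet-style request handling, where one thread serves one request from start to finish, but it is dangerous unless it is cleared reliably, and its value does not follow work handed to another thread (executor tasks, CompletableFuture stages, reactive pipelines).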

The minimal safe wrapper:

public final class TenantContextHolder {
    private static final ThreadLocal<TenantContext> HOLDER = new ThreadLocal<>();

    public static TenantContext require() {
        TenantContext context = HOLDER.get();
        if (context == null) {
            throw new MissingTenantContextException();
        }
        return context;
    }

    public static void set(TenantContext context) {
        if (context == null) {
            throw new IllegalArgumentException("tenant context must not be null");
        }
        HOLDER.set(context);
    }

    public static void clear() {
        HOLDER.remove();
    }

    private TenantContextHolder() {}
}

HTTP filter:

public final class TenantContextFilter extends OncePerRequestFilter {
    private final TenantResolver tenantResolver;

    public TenantContextFilter(TenantResolver tenantResolver) {
        this.tenantResolver = tenantResolver;
    }

    @Override
    protected void doFilterInternal(
        HttpServletRequest request,
        HttpServletResponse response,
        FilterChain filterChain
    ) throws ServletException, IOException {
        try {
            TenantContext context = tenantResolver.resolve(request);
            TenantContextHolder.set(context);
            filterChain.doFilter(request, response);
        } finally {
            TenantContextHolder.clear();
        }
    }
}

Key invariant:

Every entry point that touches tenant-owned data must either set tenant context or explicitly prove that it is tenant-independent.

Entry points include:

HTTP controllers,
GraphQL resolvers,
gRPC handlers,
message consumers,
scheduled jobs,
batch processors,
admin tools,
migration runners,
CLI scripts,
tests.
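
Several of these entry points hand work to another thread, where a ThreadLocal set by the caller is not visible. A minimal sketch of explicit propagation follows. TenantIdHolder is a reduced stand-in for the TenantContextHolder above (tenant ID only, so the example is self-contained), and TenantPropagatingExecutor is a hypothetical decorator, not a framework class.

```java
import java.util.UUID;
import java.util.concurrent.Executor;

// Reduced stand-in for the TenantContextHolder above.
final class TenantIdHolder {
    private static final ThreadLocal<UUID> HOLDER = new ThreadLocal<>();

    static UUID current() { return HOLDER.get(); } // null means no tenant is bound
    static void set(UUID tenantId) { HOLDER.set(tenantId); }
    static void clear() { HOLDER.remove(); }

    private TenantIdHolder() {}
}

// Hypothetical decorator: captures the submitter's tenant, binds it on the worker thread.
final class TenantPropagatingExecutor implements Executor {
    private final Executor delegate;

    TenantPropagatingExecutor(Executor delegate) {
        this.delegate = delegate;
    }

    @Override
    public void execute(Runnable task) {
        UUID captured = TenantIdHolder.current();
        if (captured == null) {
            throw new IllegalStateException("no tenant context to propagate");
        }
        delegate.execute(() -> {
            UUID previous = TenantIdHolder.current();
            TenantIdHolder.set(captured);
            try {
                task.run();
            } finally {
                // Restore the worker thread so pooled threads never keep a stale tenant.
                if (previous == null) {
                    TenantIdHolder.clear();
                } else {
                    TenantIdHolder.set(previous);
                }
            }
        });
    }
}
```

Refusing to submit work without a bound tenant is a deliberate choice in this sketch: a task that is genuinely tenant-independent should go through a separate, explicitly named executor.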

10. Tenant Context Is Not Authorization

Do not trust tenant ID from request headers blindly.

Bad:

X-Tenant-Id: 3b2a...

Then:

UUID tenantId = UUID.fromString(request.getHeader("X-Tenant-Id"));

This allows tenant spoofing unless the header is protected and validated.

Better:

Authenticate principal
Load memberships/claims
Resolve requested tenant
Verify principal may act in tenant
Build tenant context

public TenantContext resolve(HttpServletRequest request) {
    Principal principal = authenticationService.requirePrincipal();
    UUID requestedTenantId = extractRequestedTenant(request);

    TenantMembership membership = membershipRepository
        .findActiveMembership(principal.id(), requestedTenantId)
        .orElseThrow(ForbiddenTenantAccessException::new);

    return new TenantContext(
        requestedTenantId,
        principal.id(),
        membership.roles(),
        membership.region(),
        correlationIdProvider.current()
    );
}

A tenant context should be derived from trusted identity and authorization state, not merely copied from untrusted input.

11. Shared-Schema Modelling Rules

For shared-schema multitenancy, table design matters more than annotation convenience.

11.1 Tenant-Owned Tables Must Have `tenant_id`

create table cases (
    tenant_id uuid not null,
    id uuid not null,
    case_number varchar(64) not null,
    status varchar(32) not null,
    created_at timestamptz not null,
    primary key (tenant_id, id),
    unique (tenant_id, case_number)
);

11.2 Foreign Keys Should Include Tenant Boundary

Bad:

create table case_events (
    tenant_id uuid not null,
    id uuid primary key,
    case_id uuid not null references cases(id)
);

This foreign key only proves that the case exists. Nothing forces the event's tenant_id to match the case's tenant_id, so an event stored under tenant A can reference a case owned by tenant B.

Better:

create table case_events (
    tenant_id uuid not null,
    id uuid not null,
    case_id uuid not null,
    event_type varchar(64) not null,
    occurred_at timestamptz not null,
    primary key (tenant_id, id),
    foreign key (tenant_id, case_id) references cases(tenant_id, id)
);

This encodes tenant isolation into the database.

11.3 Unique Constraints Must Be Tenant-Aware

Bad:

unique (email)

This may be correct for a global login identity, but it is wrong for tenant-local employee records.

Better for tenant-local identity:

unique (tenant_id, email)

11.4 Indexes Must Match Tenant-Scoped Queries

Common query:

select *
from cases
where tenant_id = ?
  and status = ?
order by created_at desc
limit 50;

Useful index:

create index idx_cases_tenant_status_created
on cases (tenant_id, status, created_at desc);

A global index on only status is usually weak in shared-schema SaaS because status has low selectivity and ignores the tenant boundary.

12. JPA Mapping for Shared-Schema Tenancy

A basic tenant-scoped entity:

@MappedSuperclass
public abstract class TenantScopedEntity {

    @Column(name = "tenant_id", nullable = false, updatable = false)
    private UUID tenantId;

    protected TenantScopedEntity() {
    }

    protected TenantScopedEntity(UUID tenantId) {
        this.tenantId = Objects.requireNonNull(tenantId);
    }

    public UUID tenantId() {
        return tenantId;
    }
}

Entity:

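@Entity
@Table(
    name = "cases",
    uniqueConstraints = {
        @UniqueConstraint(
            name = "uk_cases_tenant_case_number",
            columnNames = {"tenant_id", "case_number"}
        )
    }
)
// tenant_id is written through the embedded ID below, so the inherited
// tenantId attribute is mapped read-only to avoid a duplicate column mapping.
@AttributeOverride(
    name = "tenantId",
    column = @Column(name = "tenant_id", insertable = false, updatable = false)
)
public class CaseEntity extends TenantScopedEntity {

    @EmbeddedId
    private CaseEntityId id;

    @Column(name = "case_number", nullable = false, length = 64)
    private String caseNumber;

    @Enumerated(EnumType.STRING)
    @Column(name = "status", nullable = false, length = 32)
    private CaseStatus status;

    // Used by the tenant-scoped "recent cases" queries later in this part.
    @Column(name = "created_at", nullable = false, updatable = false)
    private Instant createdAt;

    protected CaseEntity() {
    }

    public CaseEntity(UUID tenantId, UUID caseId, String caseNumber) {
        super(tenantId);
        this.id = new CaseEntityId(tenantId, caseId);
        this.caseNumber = requireNonBlank(caseNumber);
        this.status = CaseStatus.DRAFT;
        this.createdAt = Instant.now();
    }
}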

Composite key:

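// Records as @Embeddable require a recent Hibernate 6 release;
// on older versions use a class with equals() and hashCode().
@Embeddable
public record CaseEntityId(
    @Column(name = "tenant_id") UUID tenantId,
    @Column(name = "id") UUID id
) implements Serializable {}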

This is verbose, but it expresses a valuable invariant:

A case ID without tenant ID is not enough to identify a row in shared-schema tenancy.

13. Surrogate ID vs Composite Tenant Key

There are two common models.

13.1 Global Surrogate ID + Tenant Column

id uuid primary key,
tenant_id uuid not null

Pros:

simple JPA mapping,
simple URLs,
simple references,
easier integration with frameworks.

Cons:

database may not prevent cross-tenant FK mistakes,
repository methods can accidentally find by id only,
tenant scoping becomes convention-heavy,
uniqueness constraints must be explicitly tenant-aware.

13.2 Composite Key: `(tenant_id, id)`

Pros:

tenant boundary encoded in primary key,
foreign keys can enforce same-tenant relation,
safer for shared-schema systems,
query plans often start with tenant partition naturally.

Cons:

more verbose mapping,
more verbose repository signatures,
DTOs and URLs need both identifiers,
some frameworks expect simple id.

For high-risk shared-schema systems, composite tenant keys are often worth the friction.

A compromise is:

id uuid primary key,
tenant_id uuid not null,
unique (tenant_id, id)

Then all child FKs reference (tenant_id, id) instead of only id.

14. Repository Design: Tenant-Scoped by Construction

Bad:

public interface CaseRepository extends JpaRepository<CaseEntity, UUID> {
    Optional<CaseEntity> findById(UUID id);
}

This API invites tenant leaks.

Better:

public interface CaseRepository extends Repository<CaseEntity, CaseEntityId> {

    Optional<CaseEntity> findById(CaseEntityId id);

    @Query("""
        select c
        from CaseEntity c
        where c.id.tenantId = :tenantId
          and c.status = :status
        order by c.createdAt desc
    """)
    List<CaseEntity> findRecentByStatus(
        @Param("tenantId") UUID tenantId,
        @Param("status") CaseStatus status,
        Pageable pageable
    );
}

Even better for service use:

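public final class TenantScopedCaseRepository {
    private final CaseRepository delegate;

    public TenantScopedCaseRepository(CaseRepository delegate) {
        this.delegate = delegate;
    }

    // The tenant comes from the context, never from the caller.
    public Optional<CaseEntity> findById(UUID caseId) {
        UUID tenantId = TenantContextHolder.require().tenantId();
        return delegate.findById(new CaseEntityId(tenantId, caseId));
    }
}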

The application service should not repeatedly extract tenant ID manually in every method if a safer abstraction can do it consistently.

15. Hibernate Multitenancy Concepts

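Hibernate supports multitenancy at the provider level for strategies such as separate database and separate schema. Since Hibernate 6, the @TenantId annotation also covers the shared-schema discriminator strategy. The core SPI concepts for connection-level routing are: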

Concept	Role
`CurrentTenantIdentifierResolver`	Resolves the current tenant identifier
`MultiTenantConnectionProvider`	Supplies connections for the target tenant
Tenant identifier	The runtime routing key used by Hibernate
Session/EntityManager	Bound to one tenant context for its lifecycle

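Conceptual flow: when a session is opened, Hibernate asks the resolver for the current tenant identifier, requests a connection for that identifier from the connection provider, and keeps the session bound to that tenant until it closes.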

A simplified resolver:

public final class CurrentTenantResolver
    implements CurrentTenantIdentifierResolver<String> {

    @Override
    public String resolveCurrentTenantIdentifier() {
        return TenantContextHolder.require().tenantId().toString();
    }

    @Override
    public boolean validateExistingCurrentSessions() {
        return true;
    }
}

A provider for database-per-tenant:

public final class TenantDataSourceConnectionProvider
    implements MultiTenantConnectionProvider<String> {

    private final TenantDataSourceRegistry registry;

    public TenantDataSourceConnectionProvider(TenantDataSourceRegistry registry) {
        this.registry = registry;
    }

    @Override
    public Connection getConnection(String tenantIdentifier) throws SQLException {
        return registry.requireDataSource(tenantIdentifier).getConnection();
    }

    @Override
    public void releaseConnection(String tenantIdentifier, Connection connection)
        throws SQLException {
        connection.close();
    }

    @Override
    public Connection getAnyConnection() throws SQLException {
        return registry.defaultDataSource().getConnection();
    }

    @Override
    public void releaseAnyConnection(Connection connection) throws SQLException {
        connection.close();
    }

    @Override
    public boolean supportsAggressiveRelease() {
        return false;
    }

    @Override
    public boolean isUnwrappableAs(Class<?> unwrapType) {
        return unwrapType.isAssignableFrom(getClass());
    }

    @Override
    public <T> T unwrap(Class<T> unwrapType) {
        if (isUnwrappableAs(unwrapType)) {
            return unwrapType.cast(this);
        }
        throw new IllegalArgumentException("Unsupported unwrap type: " + unwrapType);
    }
}

The exact API shape can differ by Hibernate version, but the conceptual contract is stable: resolve tenant, acquire tenant-bound connection, ensure the session cannot silently cross tenants.

16. Spring `AbstractRoutingDataSource`

For database-per-tenant outside Hibernate's multitenancy SPI, Spring's AbstractRoutingDataSource is common.

public final class TenantRoutingDataSource extends AbstractRoutingDataSource {
    @Override
    protected Object determineCurrentLookupKey() {
        return TenantContextHolder.require().tenantId();
    }
}

This works when:

datasource map is known or dynamically managed,
tenant context is set before transaction begins,
connection acquisition happens after tenant context resolution,
no transaction reuses a connection from a previous tenant,
scheduled/message consumers set tenant context explicitly.

Critical rule:

Set tenant context before entering @Transactional, not inside it after the connection may already be acquired.

Bad:

@Transactional
public void handle(Request request) {
    TenantContextHolder.set(resolveTenant(request)); // too late if connection already acquired
    repository.save(...);
}

Better:

public void handle(Request request) {
    TenantContextHolder.set(resolveTenant(request));
    try {
        transactionalHandler.handle(request);
    } finally {
        TenantContextHolder.clear();
    }
}

Where transactionalHandler.handle() is the @Transactional boundary.

17. Row-Level Security as Defense in Depth

Some databases support row-level security policies. In shared-schema multitenancy, RLS can enforce tenant filtering in the database.

Conceptual example:

alter table cases enable row level security;

create policy tenant_isolation_on_cases
on cases
using (tenant_id = current_setting('app.tenant_id')::uuid);

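Then the application sets tenant context in the database session (with pooled connections, prefer PostgreSQL's set local inside a transaction, so the value is discarded when the transaction ends):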

set app.tenant_id = '...';

RLS is powerful, but it does not remove the need for application-level checks.

Risks:

connection pool session state must be reset,
migrations/admin jobs may bypass policy accidentally,
query plans can be harder to reason about,
local tests may not match production DB policy,
vendor portability decreases.

Good model:

Application authorization
+ tenant-scoped repository/query
+ database constraints/FKs
+ optional row-level security
= defense in depth
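
For the row-level security layer, the session setting has to be bound on the same connection and inside the same transaction as the queries it protects. The sketch below is PostgreSQL-specific and hypothetical in its naming (RlsTenantBinder). It uses PostgreSQL's set_config function with is_local = true, which scopes the value to the current transaction, and it assumes the policy reads app.tenant_id as in the conceptual example above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.UUID;

// Hypothetical helper; PostgreSQL-specific.
final class RlsTenantBinder {

    // Third argument true: the setting lasts only until the current transaction ends.
    static final String SET_TENANT_SQL = "select set_config('app.tenant_id', ?, true)";

    static void bindTenant(Connection connection, UUID tenantId) throws SQLException {
        // In autocommit mode the transaction-local value would vanish before the next query runs.
        if (connection.getAutoCommit()) {
            throw new IllegalStateException("bind the tenant inside an explicit transaction");
        }
        try (PreparedStatement statement = connection.prepareStatement(SET_TENANT_SQL)) {
            statement.setString(1, tenantId.toString());
            statement.execute();
        }
    }

    private RlsTenantBinder() {}
}
```

Binding the tenant as a parameter rather than concatenating it into a set statement also keeps the tenant ID out of the SQL text.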

18. Tenant Leak Failure Modes

18.1 Missing Predicate

@Query("select c from CaseEntity c where c.status = :status")
List<CaseEntity> findByStatus(CaseStatus status);

Leak: returns all tenants.

Fix:

@Query("""
    select c
    from CaseEntity c
    where c.tenantId = :tenantId
      and c.status = :status
""")
List<CaseEntity> findByTenantAndStatus(
    @Param("tenantId") UUID tenantId,
    @Param("status") CaseStatus status
);

18.2 Cross-Tenant Association

case_event.tenant_id = tenant_a
case_event.case_id points to tenant_b case

Fix: composite FK including tenant ID.

18.3 Admin Endpoint Bypass

Admin tools often bypass normal user flows.

Bad:

@GetMapping("/admin/cases/{id}")
CaseDto getCase(@PathVariable UUID id) {
    return mapper.toDto(caseRepository.findById(id).orElseThrow());
}

Fix: admin still needs explicit scope.

@GetMapping("/admin/tenants/{tenantId}/cases/{caseId}")
CaseDto getCase(
    @PathVariable UUID tenantId,
    @PathVariable UUID caseId
) {
    return adminCaseService.getCase(tenantId, caseId);
}

18.4 Background Job Without Tenant Context

Bad:

@Scheduled(fixedDelay = 60_000)
void expireCases() {
    caseRepository.expireOldCases(); // Which tenant?
}

Better:

@Scheduled(fixedDelay = 60_000)
void expireCases() {
    for (Tenant tenant : tenantRegistry.activeTenants()) {
        TenantContextHolder.set(tenant.toContext());
        try {
            caseExpirationService.expireForCurrentTenant();
        } finally {
            TenantContextHolder.clear();
        }
    }
}

18.5 Caching Without Tenant Key

Bad cache key:

case:{caseId}

Better:

tenant:{tenantId}:case:{caseId}

If tenant is not part of the cache key, cache becomes a tenant leak vector.
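
A structured key type makes the tenant impossible to omit. This is a minimal in-memory sketch: TenantCacheKey and TenantAwareCache are hypothetical names, and a ConcurrentHashMap stands in for a real cache. The render() method produces the tenant:{tenantId}:case:{caseId} form shown above for caches that need string keys.

```java
import java.util.Map;
import java.util.Objects;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical key type: a key cannot be constructed without a tenant.
record TenantCacheKey(UUID tenantId, String entityType, UUID entityId) {
    TenantCacheKey {
        Objects.requireNonNull(tenantId, "tenantId");
        Objects.requireNonNull(entityType, "entityType");
        Objects.requireNonNull(entityId, "entityId");
    }

    // String form for caches that only accept string keys.
    String render() {
        return "tenant:" + tenantId + ":" + entityType + ":" + entityId;
    }
}

final class TenantAwareCache<V> {
    private final Map<TenantCacheKey, V> entries = new ConcurrentHashMap<>();

    // Loads the value on first access; later calls with an equal key reuse it.
    V get(TenantCacheKey key, Function<TenantCacheKey, V> loader) {
        return entries.computeIfAbsent(key, loader);
    }
}
```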

19. Tenant-Aware Domain Events

Every tenant-owned event should carry tenant identity.

public record CaseAssignedEvent(
    UUID tenantId,
    UUID caseId,
    UUID assigneeId,
    Instant occurredAt
) {}

Outbox table:

create table outbox_events (
    tenant_id uuid not null,
    id uuid not null,
    aggregate_type varchar(100) not null,
    aggregate_id uuid not null,
    event_type varchar(200) not null,
    payload jsonb not null,
    status varchar(32) not null,
    created_at timestamptz not null,
    published_at timestamptz,
    primary key (tenant_id, id)
);

Message headers:

X-Tenant-Id: ...
X-Correlation-Id: ...
X-Event-Id: ...

Consumer rule:

A message consumer must resolve tenant context from trusted message metadata before touching tenant-owned persistence.

20. Tenant Migration Patterns

20.1 Shared Schema Migration

One schema migration affects all tenants.

Pros:

simple version tracking,
one migration run,
less operational fan-out.

Risks:

migration failure impacts every tenant,
long locks affect all tenants,
backfill can create global load spike.

20.2 Schema/Database Per Tenant Migration

Need metadata:

create table tenant_schema_version (
    tenant_id uuid primary key,
    current_version varchar(64) not null,
    last_migrated_at timestamptz not null,
    last_error text
);

Operational concerns:

run migrations in waves,
pause on error,
support retry,
track per-tenant version,
prevent app from routing to incompatible tenant schema,
support tenant-specific maintenance windows.
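
The first two concerns, running in waves and pausing on error, can be sketched as a small orchestration loop. WaveMigrationRunner and its Outcome record are hypothetical, and migrateTenant is a placeholder for whatever actually runs the per-tenant migration. The sketch finishes the wave in progress and then stops, so a bad migration reaches at most one wave of tenants.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.function.Consumer;

// Hypothetical orchestration sketch; migrateTenant stands in for the real migration call.
final class WaveMigrationRunner {

    record Outcome(List<UUID> migrated, List<UUID> failed, List<UUID> notAttempted) {}

    static Outcome run(List<UUID> tenants, int waveSize, Consumer<UUID> migrateTenant) {
        if (waveSize <= 0) {
            throw new IllegalArgumentException("waveSize must be positive");
        }
        List<UUID> migrated = new ArrayList<>();
        List<UUID> failed = new ArrayList<>();
        int next = 0;
        // Start a new wave only while no tenant has failed so far.
        while (next < tenants.size() && failed.isEmpty()) {
            int end = Math.min(next + waveSize, tenants.size());
            for (UUID tenant : tenants.subList(next, end)) {
                try {
                    migrateTenant.accept(tenant);
                    migrated.add(tenant);
                } catch (RuntimeException failure) {
                    failed.add(tenant); // finish the current wave, then pause
                }
            }
            next = end;
        }
        return new Outcome(
            List.copyOf(migrated),
            List.copyOf(failed),
            List.copyOf(tenants.subList(next, tenants.size())));
    }

    private WaveMigrationRunner() {}
}
```

A retry would pass the failed and not-attempted tenants back in; tenants already migrated are not touched again.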

20.3 Expand-Contract for Tenanted Data

Same pattern as normal schema migration, but tenant-aware.

1. Expand: add nullable column/table/index
2. Deploy app that writes both old and new shape
3. Backfill per tenant, throttled
4. Validate per tenant
5. Switch reads
6. Contract old shape after all tenants are safe

For many tenants, backfill must be resumable:

create table tenant_backfill_progress (
    tenant_id uuid not null,
    job_name varchar(128) not null,
    last_processed_id uuid,
    status varchar(32) not null,
    updated_at timestamptz not null,
    primary key (tenant_id, job_name)
);
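
A resumable backfill can be sketched as keyset pagination driven by the last_processed_id checkpoint. In this minimal sketch an in-memory map stands in for tenant_backfill_progress and a sorted set of IDs stands in for the tenant's rows; BackfillProgressStore and TenantBackfillJob are hypothetical names. Because a crash between processing a batch and writing its checkpoint replays that batch, migrateRow is assumed to be idempotent.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableSet;
import java.util.UUID;
import java.util.function.Consumer;

// In-memory stand-in for the tenant_backfill_progress table.
final class BackfillProgressStore {
    private record Key(UUID tenantId, String jobName) {}

    private final Map<Key, UUID> lastProcessed = new HashMap<>();

    UUID lastProcessedId(UUID tenantId, String jobName) {
        return lastProcessed.get(new Key(tenantId, jobName)); // null: not started yet
    }

    void checkpoint(UUID tenantId, String jobName, UUID lastId) {
        lastProcessed.put(new Key(tenantId, jobName), lastId);
    }
}

final class TenantBackfillJob {
    private final BackfillProgressStore progress;
    private final int batchSize;

    TenantBackfillJob(BackfillProgressStore progress, int batchSize) {
        this.progress = progress;
        this.batchSize = batchSize;
    }

    // Runs at most maxBatches batches for one tenant; returns true once nothing remains.
    boolean runBatches(UUID tenantId, String jobName, NavigableSet<UUID> rowIds,
                       Consumer<UUID> migrateRow, int maxBatches) {
        for (int batch = 0; batch < maxBatches; batch++) {
            List<UUID> page = remaining(tenantId, jobName, rowIds).stream().limit(batchSize).toList();
            if (page.isEmpty()) {
                return true;
            }
            page.forEach(migrateRow); // assumed idempotent
            progress.checkpoint(tenantId, jobName, page.get(page.size() - 1));
        }
        return remaining(tenantId, jobName, rowIds).isEmpty();
    }

    // Keyset pagination: resume strictly after the last checkpointed ID.
    private NavigableSet<UUID> remaining(UUID tenantId, String jobName, NavigableSet<UUID> rowIds) {
        UUID last = progress.lastProcessedId(tenantId, jobName);
        return last == null ? rowIds : rowIds.tailSet(last, false);
    }
}
```

Capping the number of batches per call is what makes throttling possible: the scheduler can give each tenant a small slice and come back later.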

21. Data Partitioning Is Not Always Multitenancy

Partitioning is physical data distribution.

Tenancy is logical ownership/isolation.

They can align, but they do not have to.

Model	Meaning
Tenant partitioning	Partition key is tenant ID
Time partitioning	Partition key is date/time
Region partitioning	Partition key is geographic/data residency region
Hash partitioning	Partition key is hash of ID/tenant
Lifecycle partitioning	Active vs archived data separated

For tenant-heavy systems, common partition keys:

tenant_id
region + tenant_id
tenant_id + created_month
hash(tenant_id)

Mistake:

Partition by tenant_id when one tenant owns 80% of rows.

That creates a hot partition.

Better might be:

large tenant -> dedicated database/table partition
small tenants -> shared hash partitions
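
That split can be sketched as a router that checks a pinned list first and hashes everything else. TenantPartitionRouter and the partition names are hypothetical. The sketch uses UUID.hashCode() for brevity; a production router would more likely use a hash function whose definition is fixed independently of the JDK, since changing the hash or the bucket count moves tenants between partitions.

```java
import java.util.Map;
import java.util.UUID;

// Hypothetical router: large tenants are pinned, small tenants share hash buckets.
final class TenantPartitionRouter {
    private final Map<UUID, String> dedicated;
    private final int sharedPartitions;

    TenantPartitionRouter(Map<UUID, String> dedicated, int sharedPartitions) {
        if (sharedPartitions <= 0) {
            throw new IllegalArgumentException("sharedPartitions must be positive");
        }
        this.dedicated = Map.copyOf(dedicated);
        this.sharedPartitions = sharedPartitions;
    }

    String partitionFor(UUID tenantId) {
        String pinned = dedicated.get(tenantId);
        if (pinned != null) {
            return pinned;
        }
        // floorMod keeps the bucket non-negative even when hashCode() is negative.
        int bucket = Math.floorMod(tenantId.hashCode(), sharedPartitions);
        return "shared_" + bucket;
    }
}
```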

22. Noisy Neighbor Control

A noisy neighbor is one tenant consuming disproportionate shared resources.

Symptoms:

one tenant causes slow queries for others,
global table/index bloat,
lock contention,
cache eviction of other tenant data,
queue backlog dominated by one tenant,
connection pool exhaustion.

Controls:

Control	Example
Query budget	Max rows/time per tenant query
Rate limit	Tenant-specific request quotas
Worker fairness	Per-tenant queue partitioning
Connection limit	Dedicated pool or semaphore for large tenant
Cache partitioning	Per-tenant cache key/region sizing
Storage isolation	Move large tenant to dedicated DB
Backfill throttling	Process tenant data in small batches
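
As one example from the table, a per-tenant concurrency cap can be sketched with a semaphore per tenant. TenantConcurrencyLimiter and TenantBusyException are hypothetical names. The sketch rejects immediately instead of queueing, on the assumption that a fast rejection is easier to reason about than a tenant's requests piling up behind each other.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

final class TenantBusyException extends RuntimeException {
    TenantBusyException(UUID tenantId) {
        super("tenant exceeded its concurrency limit: " + tenantId);
    }
}

// Hypothetical limiter: each tenant gets its own fixed number of permits.
final class TenantConcurrencyLimiter {
    private final int permitsPerTenant;
    private final Map<UUID, Semaphore> semaphores = new ConcurrentHashMap<>();

    TenantConcurrencyLimiter(int permitsPerTenant) {
        this.permitsPerTenant = permitsPerTenant;
    }

    <T> T call(UUID tenantId, Supplier<T> work) {
        Semaphore semaphore =
            semaphores.computeIfAbsent(tenantId, id -> new Semaphore(permitsPerTenant));
        // Fail fast: a busy tenant is rejected without blocking a thread.
        if (!semaphore.tryAcquire()) {
            throw new TenantBusyException(tenantId);
        }
        try {
            return work.get();
        } finally {
            semaphore.release();
        }
    }
}
```

One tenant exhausting its permits has no effect on another tenant's permits, which is the property the noisy-neighbor controls are after.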

A senior persistence design should include tenant-level SLOs:

tenant_id
p50/p95/p99 latency
slow query count
row count
index size
connection usage
lock wait time
queue lag
error rate

23. Tenant Deletion and Retention

Deleting a tenant is hard.

Questions:

Is deletion physical or logical?
What retention rules apply?
Is tenant data under legal hold?
Are audit records retained?
Are backups purged?
Are outbox/inbox records purged?
Are search indexes purged?
Are caches invalidated?
Are object storage files removed?
Are derived analytics records removed?

For shared schema, deletion should be orchestrated.

A tenant tombstone is useful:

create table deleted_tenants (
    tenant_id uuid primary key,
    deleted_at timestamptz not null,
    deletion_reason varchar(200),
    purge_batch_id uuid not null
);

This prevents accidental tenant ID reuse and helps audits.

24. Tenant-Aware Observability

Every persistence signal should be attributable to a tenant, but tenant identifiers must be attached with care.

Good tags:

tenant_tier = enterprise|standard|trial
isolation_mode = shared|schema|database
region = eu|us|apac
operation = case.search

Dangerous high-cardinality tag:

tenant_id = every tenant UUID

High-cardinality labels can break metrics systems. Use tenant ID selectively in logs/traces, not always in metrics.

Recommended approach:

Signal	Tenant Tagging Approach
Logs	Include tenant ID, correlation ID, request ID
Traces	Include tenant ID when sampling/secure storage allows
Metrics	Prefer tier/region/isolation mode; tenant ID only for top-N or controlled systems
Audit logs	Include tenant ID always
Security events	Include tenant ID always

25. Testing Tenant Isolation

25.1 Basic Isolation Test

@Test
void tenantCannotReadOtherTenantCase() {
    UUID tenantA = UUID.randomUUID();
    UUID tenantB = UUID.randomUUID();

    UUID caseId = createCaseForTenant(tenantA);

    runAsTenant(tenantB, () -> {
        Optional<CaseDto> result = caseService.findCase(caseId);
        assertThat(result).isEmpty();
    });
}

25.2 Query Mutation Test

The test intentionally creates the same business identifier in two tenants.

@Test
void caseNumberUniquenessIsTenantScoped() {
    createCase(tenantA, "CASE-001");
    createCase(tenantB, "CASE-001");

    runAsTenant(tenantA, () -> {
        CaseDto result = caseService.findByCaseNumber("CASE-001");
        assertThat(result.tenantId()).isEqualTo(tenantA);
    });
}

25.3 Cache Isolation Test

@Test
void cacheKeyMustIncludeTenant() {
    UUID sharedCaseId = UUID.randomUUID();

    createCase(tenantA, sharedCaseId, "A data");
    createCase(tenantB, sharedCaseId, "B data");

    runAsTenant(tenantA, () -> assertThat(service.get(sharedCaseId).title()).isEqualTo("A data"));
    runAsTenant(tenantB, () -> assertThat(service.get(sharedCaseId).title()).isEqualTo("B data"));
}

25.4 Background Job Test

@Test
void scheduledJobProcessesEachTenantWithinOwnContext() {
    createExpiredCase(tenantA);
    createExpiredCase(tenantB);

    expirationJob.runOnce();

    runAsTenant(tenantA, () -> assertExpiredCount(1));
    runAsTenant(tenantB, () -> assertExpiredCount(1));
}

25.5 Negative Repository Test

Search the codebase for unsafe repository methods:

findById(UUID id)
findAll()
deleteAll()
@Query without tenant predicate
native query without tenant predicate
cacheable method without tenant in key

This can be enforced with ArchUnit or custom static checks.
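
One possible shape for a custom static check, using plain reflection rather than ArchUnit: flag every repository method that does not take a tenant parameter. `TenantId`, `TenantScopeCheck`, and the two repository interfaces are hypothetical and exist only to exercise the check.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

final class TenantScopeCheck {
    // Hypothetical marker type: a scoped repository method must accept one.
    record TenantId(UUID value) {}

    // Hypothetical repository interfaces used only to exercise the check.
    interface SafeCaseRepository {
        Object findByTenantAndId(TenantId tenant, UUID id);
    }

    interface UnsafeCaseRepository {
        Object findById(UUID id);
        Object findByTenantAndId(TenantId tenant, UUID id);
    }

    /** Returns the names of declared methods that take no TenantId parameter. */
    static List<String> unscopedMethods(Class<?> repositoryType) {
        List<String> violations = new ArrayList<>();
        for (Method method : repositoryType.getDeclaredMethods()) {
            boolean scoped = Arrays.asList(method.getParameterTypes()).contains(TenantId.class);
            if (!scoped) {
                violations.add(method.getName());
            }
        }
        return violations;
    }
}
```

A build-time test can assert that `unscopedMethods(...)` is empty for every repository type. This sketch inspects declared methods only, so methods inherited from a base repository interface would need `getMethods()` or a separate rule, and it does not cover `@Query` strings or cache keys.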

26. Common Anti-Patterns

26.1 Tenant ID Only in Controller

Bad:

@GetMapping("/cases/{id}")
CaseDto get(@RequestHeader("X-Tenant-Id") UUID tenantId, @PathVariable UUID id) {
    return service.get(tenantId, id);
}

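The controller is not enough. The tenant must survive service, repository, and background boundaries, and a client-supplied header is not a trusted source on its own: tenant context should be derived from authenticated and authorized state.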

26.2 Global `findById`

Bad:

repository.findById(id)

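This is one of the most common tenant leak APIs: it looks up by primary key alone, so any caller that holds or guesses an ID can load another tenant's row.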
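
A tenant-scoped alternative can be sketched with an in-memory store, assuming a hypothetical `CaseRow` shape. The only lookup offered takes both the tenant ID and the entity ID, so a row owned by another tenant reads as absent.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

final class TenantScopedCaseStore {
    // Hypothetical row shape for illustration.
    record CaseRow(UUID tenantId, UUID id, String title) {}

    private final Map<UUID, CaseRow> rowsById = new HashMap<>();

    void save(CaseRow row) {
        rowsById.put(row.id(), row);
    }

    /** Lookup requires both keys; a row owned by another tenant reads as absent. */
    Optional<CaseRow> findByTenantIdAndId(UUID tenantId, UUID id) {
        return Optional.ofNullable(rowsById.get(id))
                .filter(row -> row.tenantId().equals(tenantId));
    }
}
```

In a real repository the same rule would be expressed as a query predicate on both columns rather than a filter applied after loading.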

26.3 Tenant-Agnostic Cache

Bad:

@Cacheable(cacheNames = "case", key = "#caseId")
CaseDto get(UUID caseId) { ... }

Better:

@Cacheable(
    cacheNames = "case",
    key = "T(com.acme.TenantContextHolder).require().tenantId() + ':' + #caseId"
)
CaseDto get(UUID caseId) { ... }
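
The same idea outside Spring's cache abstraction, as a minimal sketch: the cache key is a composite of tenant ID and entity ID, so two tenants can never share an entry. `TenantAwareCache` is a hypothetical class, not a Spring or Hibernate type.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

final class TenantAwareCache<V> {
    // Composite key: record equality covers both the tenant and the entity ID.
    record Key(UUID tenantId, UUID entityId) {}

    private final Map<Key, V> entries = new ConcurrentHashMap<>();

    /** Returns the cached value for (tenant, entity), loading it on first access. */
    V get(UUID tenantId, UUID entityId, Supplier<V> loader) {
        return entries.computeIfAbsent(new Key(tenantId, entityId), key -> loader.get());
    }
}
```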

26.4 Tenant Context Not Cleared

Bad:

TenantContextHolder.set(context);
filterChain.doFilter(request, response);

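The context is never cleared, so it stays bound to the pooled thread and leaks into the next request that thread serves. Clearing it on the line after doFilter is not enough either, because an exception would skip that line.

Fix: always clear in finally.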
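
The finally-based fix can be sketched as follows. The nested `TenantContextHolder` here is a simplified stand-in that stores only a tenant UUID in a `ThreadLocal`; it is an assumption for illustration, not the holder class used elsewhere in this lesson.

```java
import java.util.UUID;

final class TenantContextDemo {
    // Simplified stand-in (assumption): holds only the tenant ID per thread.
    static final class TenantContextHolder {
        private static final ThreadLocal<UUID> CURRENT = new ThreadLocal<>();

        static void set(UUID tenantId) { CURRENT.set(tenantId); }
        static UUID get() { return CURRENT.get(); }
        static void clear() { CURRENT.remove(); }
    }

    /** Runs the work with the tenant bound, and always unbinds it afterwards. */
    static void runWithTenant(UUID tenantId, Runnable work) {
        TenantContextHolder.set(tenantId);
        try {
            work.run();
        } finally {
            // Runs on success and on exception, so the thread is returned clean.
            TenantContextHolder.clear();
        }
    }
}
```

In a servlet filter, the `work.run()` call corresponds to `filterChain.doFilter(request, response)`.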

26.5 Shared Admin Superuser Without Explicit Scope

A global admin must still choose scope.

Bad:

admin can query all tenant rows by default

Better:

admin must choose tenant, reason, ticket/reference, and access is audited
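
One way to make that rule hard to bypass is to require a scope object that cannot be constructed without a tenant, a reason, and a ticket reference, and to record it before the action runs. `AdminTenantAccess`, `Scope`, and `AuditEntry` are hypothetical names, and the in-memory list stands in for a durable audit log.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.UUID;
import java.util.function.Supplier;

final class AdminTenantAccess {
    /** Explicit scope an admin must supply; every field is required. */
    record Scope(String adminUser, UUID tenantId, String reason, String ticketRef) {
        Scope {
            Objects.requireNonNull(tenantId, "tenantId");
            requireText(adminUser, "adminUser");
            requireText(reason, "reason");
            requireText(ticketRef, "ticketRef");
        }

        private static void requireText(String value, String name) {
            if (value == null || value.isBlank()) {
                throw new IllegalArgumentException(name + " is required");
            }
        }
    }

    record AuditEntry(Instant at, Scope scope) {}

    // In-memory stand-in (assumption) for a durable audit log.
    private final List<AuditEntry> auditLog = new ArrayList<>();

    /** Records the access first, then runs the tenant-scoped action. */
    <T> T withScope(Scope scope, Supplier<T> action) {
        auditLog.add(new AuditEntry(Instant.now(), scope));
        return action.get();
    }

    List<AuditEntry> auditLog() {
        return List.copyOf(auditLog);
    }
}
```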

26.6 Cross-Tenant Reporting From OLTP Tables

Bad:

select tenant_id, count(*) from huge_table group by tenant_id;

On a large table this scans every tenant's rows and competes with online traffic for I/O, cache, and CPU.

Better:

replica / warehouse / materialized rollup / async metrics pipeline

27. Practical Design Checklist

Before approving a multitenant persistence design, answer these:

Tenant Model

What exactly is a tenant?
Can one user belong to multiple tenants?
Can one tenant belong to multiple regions?
Can tenant ownership change?
Can tenants merge/split?

Isolation Strategy

Why this isolation model?
What is the accepted blast radius?
How many tenants are expected in 1, 3, 5 years?
What is the largest expected tenant size?
What is the backup/restore requirement?

Query Safety

Can any repository query omit tenant scope?
Are native queries reviewed for tenant predicates?
Are unique constraints tenant-aware?
Are foreign keys tenant-aware?
Are cache keys tenant-aware?

Operations

Can one tenant be suspended?
Can one tenant be migrated?
Can one tenant be restored?
Can one tenant be moved to dedicated storage?
Can one tenant be purged?

Observability

Can we identify noisy tenants?
Can we detect tenant leak attempts?
Can we audit admin cross-tenant access?
Can we correlate slow queries by tenant tier/isolation mode?

28. Practice Drill

Design a multitenant case management module.

Requirements:

- Many small tenants.
- Some large enterprise tenants.
- Users can belong to multiple tenants.
- Cases have tenant-local case numbers.
- Case comments, attachments, and audit records must never cross tenants.
- Enterprise tenants may require a dedicated database later.
- Tenant deletion must support legal hold.
- Background jobs expire stale cases per tenant.

Deliverables:

Choose initial isolation strategy.
Define tenant registry model.
Define table constraints for cases, case_comments, case_assignments.
Define repository method rules.
Define cache key pattern.
Define migration/backfill strategy.
Define tests that prove no tenant leak.
Define escape hatch for moving one tenant to dedicated database.

Good answer shape:

Shared schema for standard tenants, hybrid-ready tenant registry.
Tenant-aware composite unique constraints and FKs.
Tenant context resolved after authentication and before transaction.
Repositories do not expose tenant-agnostic findById.
Outbox/inbox records include tenant_id.
Large tenants can be rehomed by registry route + data copy + cutover.
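
The "registry route" part of that answer might look like the sketch below: every tenant has an explicit route, unknown tenants fail fast, and rehoming is a single route swap performed at cutover. `TenantRouteRegistry`, `Route`, and `dataSourceKey` are hypothetical names; the data copy and verification steps are outside this sketch.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

final class TenantRouteRegistry {
    enum IsolationMode { SHARED, SCHEMA, DATABASE }

    // dataSourceKey is a hypothetical lookup key for a configured DataSource.
    record Route(IsolationMode mode, String dataSourceKey) {}

    private final Map<UUID, Route> routes = new ConcurrentHashMap<>();

    void register(UUID tenantId, Route route) {
        if (routes.putIfAbsent(tenantId, route) != null) {
            throw new IllegalStateException("tenant already registered: " + tenantId);
        }
    }

    /** Unknown tenants fail fast instead of falling back to a default route. */
    Route routeFor(UUID tenantId) {
        Route route = routes.get(tenantId);
        if (route == null) {
            throw new IllegalArgumentException("unknown tenant: " + tenantId);
        }
        return route;
    }

    /** Cutover step: call only after the tenant's data has been copied and verified. */
    void rehome(UUID tenantId, Route newRoute) {
        if (routes.replace(tenantId, newRoute) == null) {
            throw new IllegalArgumentException("unknown tenant: " + tenantId);
        }
    }
}
```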

29. Key Takeaways

Multitenancy is a data isolation model, not just a tenant_id column.
Tenant context must be derived from trusted authentication and authorization state.
Shared-schema tenancy is cheap but leak-prone; enforce tenant boundary in queries, FKs, unique constraints, and cache keys.
Database-per-tenant gives strong isolation but increases migration and operational complexity.
Schema-per-tenant sits between both extremes but requires careful connection/session state control.
Hybrid tenancy is common in mature SaaS systems.
Tenant-aware persistence must cover HTTP, async, messaging, scheduler, admin, migration, and tests.
The safest design prevents tenant leaks by construction rather than relying on developer memory.

30. Bridge to Next Part

Multitenancy defines where tenant-owned data lives and how access is scoped.

The next problem is distributed consistency:

What happens when a transaction updates the database and must also publish an event, call another service, or trigger downstream processing?

That is the dual-write problem.

Part 030 covers:

transactional outbox,
inbox deduplication,
idempotent consumers,
event relay,
polling vs CDC,
failure recovery,
ordering,
poison messages,
distributed persistence design.

