
Learn Java Identity Authentication Authorization Api Platform Part 019 Multi Tenancy Identity Isolation


title: Learn Java Identity, Authentication & Authorization for Secure Enterprise API Platform - Part 019
description: Multi-tenant identity and authorization isolation for Java enterprise API platforms, including tenant resolution, issuer strategy, tenant-aware principals, data boundary enforcement, cache/event/job isolation, admin tenant switching, and tenant escape testing.
series: learn-java-identity-authentication-authorization-api-platform
seriesTitle: Learn Java Identity, Authentication & Authorization for Secure Enterprise API Platform
order: 19
partTitle: Multi-Tenant Identity and Authorization Isolation
tags:

  • java
  • identity
  • authorization
  • multi-tenancy
  • oauth2
  • oidc
  • spring-security
  • api-security
  • enterprise-platform
  • tenant-isolation

date: 2026-06-28

Part 019 — Multi-Tenant Identity and Authorization Isolation

1. Problem Framing

Multi-tenancy is not just a database concern.

In an enterprise API platform, tenant isolation cuts across:

  • login and federation;
  • token issuer selection;
  • token validation;
  • user-to-tenant membership;
  • role and entitlement semantics;
  • authorization decisions;
  • data access predicates;
  • caches;
  • message events;
  • background jobs;
  • audit logs;
  • support/admin workflows;
  • incident response.

A platform is multi-tenant when one runtime, control plane, API, identity system, or data layer serves more than one customer, organization, workspace, business unit, realm, region, or legal partition.

The dangerous assumption is this:

"The user is authenticated, therefore the tenant is implied."

That is rarely safe.

A user can belong to multiple tenants. A client can be registered for multiple tenants. A service can process work for many tenants. A token can be valid but not valid for the tenant requested by the API call. A resource ID can belong to a different tenant than the route/header/context suggests. A cache key can silently merge tenant data. A background job can run without user context but still require tenant context.

The hard problem is not "how do we store tenant_id?"

The hard problem is:

How do we prove that every identity, token, request, resource, policy decision,
query, event, cache entry, and audit record is bound to the correct tenant boundary?

This part builds that proof model.

2. Kaufman Skill Slice

In Kaufman's framework, the effective skill is not "know multi-tenancy patterns".

The target performance is:

Given an enterprise API platform, you can identify every tenant boundary,
define a deterministic tenant resolution strategy, enforce tenant-aware authorization,
prevent cross-tenant reads/writes/events/caches, and design tests that catch tenant escape.

Break the skill into subskills:

| Subskill | What You Must Be Able To Do |
|---|---|
| Vocabulary control | Distinguish tenant, organization, account, realm, issuer, membership, workspace, and resource owner. |
| Tenant resolution | Determine the tenant from trustworthy inputs without accepting attacker-controlled ambiguity. |
| Token validation | Validate that issuer, audience, subject, client, tenant claim, and key material belong together. |
| Principal modelling | Represent the authenticated subject plus active tenant plus memberships without flattening them into roles. |
| Authorization | Evaluate tenant membership and resource tenant ownership as first-class policy inputs. |
| Data boundary | Apply tenant predicates before data enters application result sets. |
| Propagation | Carry tenant context across HTTP, messaging, jobs, caches, logs, and tracing. |
| Operations | Onboard/offboard tenants safely, rotate keys, disable compromised tenants, and audit cross-tenant admin actions. |
| Testing | Build negative tests where authenticated callers intentionally try another tenant's resources. |

The fluency marker:

You stop asking, "Where do I put tenant_id?"
and start asking, "What tenant assertion is trusted at this boundary, and how is it verified?"

3. Core Vocabulary

Multi-tenant systems become unsafe when teams reuse terms loosely.

Use precise vocabulary.

| Term | Meaning | Common Mistake |
|---|---|---|
| Tenant | Security/isolation boundary for a customer, organization, workspace, realm, or legal partition. | Treating tenant as only a display organization. |
| Organization | Business grouping that may or may not equal tenant. | Assuming every organization is a hard isolation boundary. |
| Workspace | Product/application grouping inside a tenant. | Using workspace as tenant without verifying ownership chain. |
| Account | Login/account record for a human or workload. | Treating account as globally authorized for all tenants. |
| Subject | The authenticated actor identifier, usually from token sub. | Assuming sub alone identifies tenant-specific privileges. |
| Principal | Application representation of the authenticated actor. | Cramming tenant, roles, permissions, and resource IDs into one string. |
| Membership | Relationship between subject and tenant. | Encoding current membership only in long-lived JWT claims. |
| Active tenant | Tenant selected/resolved for current request/session. | Letting client switch tenant by header without membership check. |
| Resource tenant | Tenant that owns the target resource. | Trusting route tenant instead of loading/checking resource tenant. |
| Issuer | Authorization server identity that issued token. | Accepting tokens from an issuer not trusted for requested tenant. |
| Realm | Identity namespace; common in Keycloak-like systems. | Assuming realm always equals tenant. |
| Client | OAuth client/application identity. | Allowing one client registration to operate across tenants unintentionally. |
| Service account | Non-human workload identity. | Treating service account as globally omnipotent. |
| Entitlement | Concrete access grant in a tenant/domain. | Using generic role names without tenant/resource scope. |

Important invariant:

Authentication establishes who/what the caller is.
Tenant resolution establishes which isolation boundary the request claims to operate in.
Authorization proves that the caller may perform the requested action inside that tenant boundary.

These are three related but separate steps.

4. Mental Model: Tenant Is a Boundary, Not a Filter

A weak multi-tenant design treats tenant as a UI filter.

GET /cases?tenantId=tenant-a

Then application code filters by tenantId.

A strong design treats tenant as a security boundary.

request tenant assertion
        ↓
trusted tenant resolution
        ↓
token/issuer/client validation against tenant
        ↓
active tenant context
        ↓
resource tenant check
        ↓
query predicate / data partition
        ↓
audit record includes tenant decision evidence

The tenant must be part of the platform's security context, not just a request parameter.

The Tenant Isolation Equation

For any protected operation:

permit = authenticated
      AND token_trusted_for_tenant
      AND subject_member_of_tenant
      AND client_allowed_for_tenant
      AND action_allowed_in_tenant
      AND resource_belongs_to_tenant
      AND data_access_constrained_to_tenant

Leaving out any term creates a tenant escape path.
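The equation can be made concrete as a conjunction that also reports which term failed, which is useful for audit evidence. The following is a minimal sketch; the `IsolationChecks` record and its method names are hypothetical and are not part of any framework.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical holder for the seven terms of the tenant isolation equation.
record IsolationChecks(
        boolean authenticated,
        boolean tokenTrustedForTenant,
        boolean subjectMemberOfTenant,
        boolean clientAllowedForTenant,
        boolean actionAllowedInTenant,
        boolean resourceBelongsToTenant,
        boolean dataAccessConstrainedToTenant) {

    // Returns the name of the first failing term, or empty when every term holds.
    Optional<String> firstFailure() {
        Map<String, Boolean> terms = new LinkedHashMap<>();
        terms.put("authenticated", authenticated);
        terms.put("token_trusted_for_tenant", tokenTrustedForTenant);
        terms.put("subject_member_of_tenant", subjectMemberOfTenant);
        terms.put("client_allowed_for_tenant", clientAllowedForTenant);
        terms.put("action_allowed_in_tenant", actionAllowedInTenant);
        terms.put("resource_belongs_to_tenant", resourceBelongsToTenant);
        terms.put("data_access_constrained_to_tenant", dataAccessConstrainedToTenant);
        return terms.entrySet().stream()
                .filter(e -> !e.getValue())
                .map(Map.Entry::getKey)
                .findFirst();
    }

    // Permit only when no term fails; there is no partial credit.
    boolean permit() {
        return firstFailure().isEmpty();
    }
}
```

In a real platform each boolean would be produced by a different layer (token validation, membership lookup, policy engine, repository), so a record like this is mainly a way to collect decision evidence in one place.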

5. Common Tenant Topologies

There is no universal best topology.

The right model depends on isolation, compliance, operational cost, customer customization, and identity federation needs.

5.1 Shared Issuer, Tenant Claim

One authorization server issues tokens for all tenants.

Token contains tenant-related claims:

{
  "iss": "https://idp.example.com",
  "sub": "user-123",
  "aud": "case-api",
  "tenant_id": "tenant-a",
  "scope": "case:read case:write",
  "roles": ["case-manager"]
}

Benefits:

  • operationally simpler;
  • one JWKS endpoint;
  • easier single sign-on across tenants;
  • common token validation path.

Risks:

  • stale tenant membership in tokens;
  • accidental cross-tenant claims;
  • tenant-specific policy hidden inside global issuer;
  • one issuer compromise can affect all tenants;
  • issuer cannot be used alone to determine tenant.

Use when:

  • tenants share the same trust domain;
  • tenant-specific customization is moderate;
  • platform can enforce membership and resource tenant in application/policy/data layers.

5.2 Issuer Per Tenant

Each tenant has its own issuer or realm.

Example:

https://idp.example.com/realms/tenant-a
https://idp.example.com/realms/tenant-b

Benefits:

  • stronger identity namespace separation;
  • tenant-specific federation settings;
  • tenant-specific keys/lifetimes/clients;
  • easier tenant offboarding or isolation during incidents.

Risks:

  • runtime must select correct issuer dynamically;
  • JWKS caching complexity;
  • higher operational overhead;
  • tenant discovery becomes security-sensitive;
  • issuer spoofing and lazy trust expansion are dangerous.

Use when:

  • customers require dedicated federation;
  • regulatory or contractual isolation is high;
  • tenant-specific identity policy is material.

5.3 Dedicated Authorization Server Per Tenant

Each tenant has dedicated IdP/AS infrastructure.

Benefits:

  • high isolation;
  • blast-radius reduction;
  • independent lifecycle and key management;
  • customer-controlled identity integration.

Risks:

  • high cost;
  • operational complexity;
  • more failure modes during onboarding;
  • more configuration drift.

Use when:

  • high-value regulated customers require strict isolation;
  • tenant count is manageable;
  • per-tenant compliance controls are part of the contract.

5.4 Hybrid: Platform Issuer + External Federated IdPs

The platform has its own issuer but federates users from tenant IdPs.

Benefits:

  • API trusts one platform issuer;
  • external IdPs are normalized;
  • tenant federation complexity stays in identity layer;
  • resource servers avoid trusting every enterprise IdP directly.

Risks:

  • account linking mistakes;
  • federated attributes over-trusted;
  • tenant login discovery phishing;
  • lifecycle drift between enterprise IdP and platform membership.

This is common for enterprise SaaS.

6. Decision Matrix

| Dimension | Shared Issuer | Issuer Per Tenant | Dedicated AS Per Tenant | Hybrid Federation |
|---|---|---|---|---|
| Operational simplicity | High | Medium | Low | Medium |
| Isolation strength | Medium | High | Very high | Medium-high |
| Customer federation flexibility | Medium | High | Very high | High |
| Resource server complexity | Low-medium | High | High | Medium |
| Key rotation complexity | Low | Medium-high | High | Medium |
| Tenant offboarding | Medium | High | High | Medium |
| Blast radius | High | Medium | Low | Medium |
| Best fit | Most SaaS platforms | Enterprise/regulated SaaS | High-assurance dedicated tenants | SaaS with enterprise SSO |

Practical recommendation:

Default to a shared platform issuer or hybrid federation unless tenant isolation requirements justify issuer-per-tenant complexity.
But regardless of issuer strategy, never rely on issuer alone as the full authorization decision.

7. Tenant Resolution

Tenant resolution answers:

Which tenant is this request attempting to operate in?

It does not yet answer:

Is the caller allowed to operate in that tenant?

7.1 Tenant Resolution Sources

Common sources:

| Source | Example | Trust Level | Risk |
|---|---|---|---|
| Hostname | tenant-a.api.example.com | Medium-high | DNS/domain takeover, alias misbinding. |
| Path | /tenants/{tenantId}/cases | Medium | Caller can manipulate path. |
| Token claim | tenant_id claim | Medium-high if issuer trusted | Stale/incorrect claim, multi-tenant token ambiguity. |
| Session active tenant | Server-side session attribute | High if server-controlled | Stale active tenant after membership removal. |
| Client registration | OAuth client tied to tenant | High for M2M | Client may be shared/misconfigured. |
| Header | X-Tenant-Id | Low unless gateway-signed | Spoofable by clients. |
| Body field | { "tenantId": "..." } | Low | Easy tampering. |
| Resource owner | loaded object's tenant_id | High for resource authorization | Requires safe lookup; not enough for list/create. |

Do not confuse "available" with "trusted".

A request header is easy to read, but not inherently trustworthy.

7.2 Resolution Strategy Pattern

Define an explicit strategy.

Example order:

  1. resolve tenant from trusted route/host/session/token source;
  2. reject conflicting tenant assertions;
  3. validate tenant exists and is active;
  4. validate issuer/client is allowed for tenant;
  5. validate subject/service membership;
  6. attach active tenant to security context;
  7. enforce resource tenant and data predicates downstream.

Conflict handling must be strict.

If path says tenant-a and token says tenant-b, reject.
Do not silently choose one.

7.3 Tenant Resolution Invariants

Invariant 1: Active tenant must be explicit for tenant-scoped operations.
Invariant 2: Active tenant must come from a configured resolution strategy.
Invariant 3: Conflicting tenant assertions must fail closed.
Invariant 4: Tenant context must not be inferred only from a user-controlled object ID.
Invariant 5: Tenant context must not be stored only in ThreadLocal without cleanup.
Invariant 6: Tenant context must be included in authorization, data access, cache, events, logs, and traces.

8. Java Tenant Context Model

Avoid passing raw strings everywhere.

Use a small domain type.

public record TenantId(String value) {
    public TenantId {
        if (value == null || value.isBlank()) {
            throw new IllegalArgumentException("tenant id is required");
        }
        if (!value.matches("[a-zA-Z0-9._-]{3,100}")) {
            throw new IllegalArgumentException("invalid tenant id format");
        }
    }
}

Then model active tenant context separately from authenticated subject.

public record TenantContext(
        TenantId activeTenant,
        TenantResolutionSource source,
        boolean tenantSwitch,
        String correlationId
) {
}

public enum TenantResolutionSource {
    HOST,
    PATH,
    TOKEN_CLAIM,
    SERVER_SESSION,
    CLIENT_REGISTRATION,
    GATEWAY_SIGNED_HEADER
}

Do not model it as:

String tenant;

The extra fields help with audit and debugging.

9. Tenant-Aware Principal

A safe principal distinguishes:

  • global subject identity;
  • active tenant;
  • memberships;
  • authorities for the active tenant;
  • authentication assurance;
  • client identity;
  • issuer.

Example:

public record TenantAwarePrincipal(
        String subject,
        String issuer,
        String clientId,
        TenantId activeTenant,
        Set<TenantMembership> memberships,
        Set<String> activeTenantAuthorities,
        AssuranceLevel assuranceLevel
) {
    public boolean memberOf(TenantId tenantId) {
        return memberships.stream().anyMatch(m -> m.tenantId().equals(tenantId) && m.active());
    }
}

public record TenantMembership(
        TenantId tenantId,
        String accountId,
        Set<String> roles,
        Set<String> entitlements,
        boolean active
) {
}

Important design choice:

A role without tenant scope is rarely safe in a multi-tenant product.

Prefer:

tenant-a:case-manager
tenant-b:viewer

over:

ROLE_CASE_MANAGER

But avoid encoding everything into stringly-typed role names when policy logic becomes complex.
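One way to keep tenant scope without string concatenation is to make the tenant a field of the grant. The sketch below is illustrative only: `TenantRoleGrant` and `TenantRoles` are hypothetical names, and `TenantId` is a one-line stand-in for the validated type defined earlier.

```java
import java.util.Set;

// Stand-in for the article's TenantId type, without its format validation.
record TenantId(String value) {}

// Hypothetical typed grant: the tenant is a field, not a substring of the role name.
record TenantRoleGrant(TenantId tenantId, String role) {}

final class TenantRoles {
    private final Set<TenantRoleGrant> grants;

    TenantRoles(Set<TenantRoleGrant> grants) {
        this.grants = Set.copyOf(grants);
    }

    // True only when the exact (tenant, role) pair was granted.
    boolean hasRole(TenantId tenantId, String role) {
        return grants.contains(new TenantRoleGrant(tenantId, role));
    }
}
```

Because records compare by value, a `case-manager` grant in tenant-a never satisfies a `case-manager` check in tenant-b, and no parsing of role strings is involved.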

10. Tenant Claims in Tokens

Tenant claims are useful but dangerous when treated as complete authorization.

10.1 Single Active Tenant Token

{
  "iss": "https://idp.example.com",
  "sub": "user-123",
  "aud": "case-api",
  "tenant_id": "tenant-a",
  "scope": "case:read case:write",
  "exp": 1782600000
}

Good for:

  • clear token-to-tenant binding;
  • simpler resource server checks;
  • easy audit.

Risks:

  • user must get new token to switch tenant;
  • membership changes may stay stale until token expiry;
  • token issued for wrong tenant is high impact.

10.2 Multi-Tenant Membership Token

{
  "iss": "https://idp.example.com",
  "sub": "user-123",
  "aud": "case-api",
  "tenants": [
    { "id": "tenant-a", "roles": ["case-manager"] },
    { "id": "tenant-b", "roles": ["viewer"] }
  ]
}

Good for:

  • lower token switching friction;
  • fewer round trips.

Risks:

  • token bloat;
  • stale memberships;
  • accidental privilege leakage;
  • harder cache invalidation;
  • role semantics leak into token.

10.3 No Tenant Claim, Lookup Membership at Runtime

Good for:

  • revocation responsiveness;
  • dynamic authorization;
  • small tokens.

Risks:

  • dependency on membership service;
  • performance/caching complexity;
  • more failure modes.

Practical recommendation:

For high-value enterprise APIs, prefer short-lived access tokens with explicit active tenant,
then validate membership/resource ownership at runtime for sensitive operations.
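The recommendation combines two checks: the token's tenant claim must match the request tenant, and membership must still be active when the request arrives. A minimal sketch follows, assuming a hypothetical `MembershipGuard` class and a membership lookup supplied as a `BiPredicate`; plain strings stand in for the typed identifiers to keep the block self-contained.

```java
import java.util.function.BiPredicate;

final class MembershipGuard {
    // Hypothetical lookup: (subject, tenantId) -> true when membership is currently active.
    private final BiPredicate<String, String> activeMembershipLookup;

    MembershipGuard(BiPredicate<String, String> activeMembershipLookup) {
        this.activeMembershipLookup = activeMembershipLookup;
    }

    // The token's tenant claim says which tenant was active at issuance;
    // the lookup says whether the subject is still a member right now.
    boolean mayOperate(String subject, String tokenTenantId, String requestTenantId) {
        if (subject == null || tokenTenantId == null || !tokenTenantId.equals(requestTenantId)) {
            return false;
        }
        try {
            return activeMembershipLookup.test(subject, requestTenantId);
        } catch (RuntimeException lookupFailure) {
            // Fail closed: an unavailable membership service must not grant access.
            return false;
        }
    }
}
```

Whether every request or only sensitive operations pay for the lookup is a trade-off between revocation latency and load on the membership service.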

11. Spring Security: Multi-Tenant Resource Server

Spring Security's resource server can be configured for multi-tenancy by resolving which authentication manager/JWT decoder should validate a bearer token for a request.

The core idea:

request -> resolve tenant/issuer -> choose JwtDecoder/AuthenticationManager -> validate token -> build Authentication

11.1 Servlet AuthenticationManagerResolver Sketch

@Configuration
@EnableWebSecurity
class SecurityConfig {

    @Bean
    SecurityFilterChain apiSecurity(HttpSecurity http,
                                    AuthenticationManagerResolver<HttpServletRequest> resolver) throws Exception {
        return http
                .csrf(csrf -> csrf.disable())
                .authorizeHttpRequests(auth -> auth
                        .requestMatchers("/actuator/health").permitAll()
                        .anyRequest().authenticated())
                .oauth2ResourceServer(oauth2 -> oauth2
                        .authenticationManagerResolver(resolver))
                .build();
    }
}

Resolver:

@Component
final class TenantAuthenticationManagerResolver
        implements AuthenticationManagerResolver<HttpServletRequest> {

    private final TenantResolver tenantResolver;
    private final TenantIssuerRegistry issuerRegistry;
    private final ConcurrentMap<TenantId, AuthenticationManager> managers = new ConcurrentHashMap<>();

    TenantAuthenticationManagerResolver(TenantResolver tenantResolver,
                                        TenantIssuerRegistry issuerRegistry) {
        this.tenantResolver = tenantResolver;
        this.issuerRegistry = issuerRegistry;
    }

    @Override
    public AuthenticationManager resolve(HttpServletRequest request) {
        TenantId tenantId = tenantResolver.resolve(request)
                .orElseThrow(() -> new BadCredentialsException("Tenant is required"));

        TenantIssuer issuer = issuerRegistry.findActiveIssuerForTenant(tenantId)
                .orElseThrow(() -> new BadCredentialsException("Unknown tenant issuer"));

        return managers.computeIfAbsent(tenantId, ignored -> managerFor(issuer));
    }

    private AuthenticationManager managerFor(TenantIssuer issuer) {
        NimbusJwtDecoder decoder = NimbusJwtDecoder
                .withIssuerLocation(issuer.issuerUri())
                .build();

        OAuth2TokenValidator<Jwt> validator = new DelegatingOAuth2TokenValidator<>(
                JwtValidators.createDefaultWithIssuer(issuer.issuerUri()),
                new AudienceValidator(issuer.allowedAudiences()),
                new TenantClaimValidator(issuer.tenantId())
        );

        decoder.setJwtValidator(validator);

        JwtAuthenticationProvider provider = new JwtAuthenticationProvider(decoder);
        provider.setJwtAuthenticationConverter(new TenantJwtAuthenticationConverter());
        return provider::authenticate;
    }
}

This is a sketch, not a drop-in implementation.

Production concerns:

  • do not allow arbitrary issuer URLs from request input;
  • registry must be allowlisted;
  • cache must support tenant disable and key rotation;
  • issuer discovery should have timeouts;
  • metrics must expose validation failures by reason without leaking sensitive data;
  • unknown tenants must fail closed.
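The resolver sketch caches managers by tenant only, so a changed issuer URI or a disabled tenant would keep using the old entry. One possible shape for a cache that addresses this is sketched below. `TenantValidatorCache` is a hypothetical helper, generic over the cached value so it stays independent of Spring types, and it uses plain strings for tenant IDs.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical cache for per-tenant validators (for example AuthenticationManager instances).
final class TenantValidatorCache<V> {

    // Remembers which issuer URI an entry was built for, so a registry change is detected.
    private record Entry<V>(String issuerUri, V validator) {}

    private final ConcurrentMap<String, Entry<V>> entries = new ConcurrentHashMap<>();

    // Returns the cached validator, rebuilding it when the tenant's issuer URI has changed.
    V get(String tenantId, String currentIssuerUri, Function<String, V> factory) {
        Entry<V> entry = entries.compute(tenantId, (id, existing) ->
                existing != null && existing.issuerUri().equals(currentIssuerUri)
                        ? existing
                        : new Entry<V>(currentIssuerUri, factory.apply(currentIssuerUri)));
        return entry.validator();
    }

    // Call when a tenant is disabled or its validator must be rebuilt.
    void evict(String tenantId) {
        entries.remove(tenantId);
    }
}
```

Eviction would typically be triggered by the registry change events mentioned above; how those events are delivered is outside this sketch.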

11.2 Tenant Resolver

public interface TenantResolver {
    Optional<TenantId> resolve(HttpServletRequest request);
}

Example strict resolver:

@Component
final class StrictTenantResolver implements TenantResolver {

    @Override
    public Optional<TenantId> resolve(HttpServletRequest request) {
        Optional<TenantId> pathTenant = fromPath(request);
        Optional<TenantId> hostTenant = fromHost(request);
        Optional<TenantId> signedHeaderTenant = fromSignedGatewayHeader(request);

        Set<TenantId> candidates = Stream.of(pathTenant, hostTenant, signedHeaderTenant)
                .flatMap(Optional::stream)
                .collect(Collectors.toSet());

        if (candidates.size() > 1) {
            throw new BadCredentialsException("Conflicting tenant assertions");
        }

        return candidates.stream().findFirst();
    }

    private Optional<TenantId> fromPath(HttpServletRequest request) {
        // Example only: use routing metadata rather than fragile string parsing in real systems.
        String uri = request.getRequestURI();
        Matcher matcher = Pattern.compile("^/tenants/([^/]+)/.*").matcher(uri);
        return matcher.matches() ? Optional.of(new TenantId(matcher.group(1))) : Optional.empty();
    }

    private Optional<TenantId> fromHost(HttpServletRequest request) {
        String host = request.getServerName();
        if (host.endsWith(".api.example.com")) {
            String subdomain = host.substring(0, host.length() - ".api.example.com".length());
            return Optional.of(new TenantId(subdomain));
        }
        return Optional.empty();
    }

    private Optional<TenantId> fromSignedGatewayHeader(HttpServletRequest request) {
        String tenant = request.getHeader("X-Verified-Tenant");
        String signature = request.getHeader("X-Verified-Tenant-Signature");

        if (tenant == null) {
            return Optional.empty();
        }
        if (!verifyGatewaySignature(tenant, signature)) {
            throw new BadCredentialsException("Invalid tenant header signature");
        }
        return Optional.of(new TenantId(tenant));
    }

    private boolean verifyGatewaySignature(String tenant, String signature) {
        // Placeholder only: this accepts ANY non-blank signature and proves nothing.
        // Use HMAC or mTLS-authenticated gateway metadata in real systems.
        return signature != null && !signature.isBlank();
    }
}

Important:

If the public internet can set X-Tenant-Id, it is not a trusted tenant assertion.

A gateway-injected tenant header is only useful if the application can prove it came from the gateway.
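One way to obtain that proof is an HMAC over the tenant value with a secret shared between gateway and application. The sketch below uses only JDK classes; `GatewayTenantSignature` is a hypothetical name, and the secret distribution and the header format (unpadded base64url) are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class GatewayTenantSignature {
    private static final String ALGORITHM = "HmacSHA256";
    private final byte[] sharedSecret;

    GatewayTenantSignature(byte[] sharedSecret) {
        this.sharedSecret = sharedSecret.clone();
    }

    // What the gateway would compute and send alongside the tenant header.
    String sign(String tenantId) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(new SecretKeySpec(sharedSecret, ALGORITHM));
            byte[] tag = mac.doFinal(tenantId.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(tag);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC unavailable", e);
        }
    }

    // Fails closed on missing or malformed input; compares in constant time.
    boolean verify(String tenantId, String signature) {
        if (tenantId == null || signature == null) {
            return false;
        }
        byte[] presented;
        try {
            presented = Base64.getUrlDecoder().decode(signature);
        } catch (IllegalArgumentException malformed) {
            return false;
        }
        byte[] expected = Base64.getUrlDecoder().decode(sign(tenantId));
        return MessageDigest.isEqual(expected, presented);
    }
}
```

Signing only the tenant value leaves the header replayable, so a production scheme would likely also bind a timestamp or request identifier into the signed input, or rely on mTLS between gateway and application instead.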

12. Tenant Issuer Registry

A resource server must not discover issuers dynamically from attacker input.

Bad:

String issuer = request.getHeader("X-Issuer");
JwtDecoder decoder = JwtDecoders.fromIssuerLocation(issuer);

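This lets the caller choose the trust anchor. An attacker can name an issuer they control, the decoder fetches that issuer's metadata and keys, and tokens the attacker signed then validate. The result is trust-on-first-use behavior, and the outbound metadata fetch to a caller-chosen URL is a server-side request forgery risk in its own right.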

Better:

public interface TenantIssuerRegistry {
    Optional<TenantIssuer> findActiveIssuerForTenant(TenantId tenantId);
}

public record TenantIssuer(
        TenantId tenantId,
        String issuerUri,
        Set<String> allowedAudiences,
        Set<String> allowedClientIds,
        boolean active
) {
}

Registry invariants:

- tenant must exist;
- tenant must be active;
- issuer URI must be allowlisted;
- allowed audiences must be explicit;
- allowed clients must be explicit for sensitive APIs;
- disabled tenant must stop accepting tokens quickly;
- registry changes must be audited.
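Several of these invariants can be enforced at lookup time so that callers cannot obtain a non-compliant entry. The following is a minimal in-memory sketch: `IssuerEntry` is a simplified stand-in for the `TenantIssuer` record, `InMemoryIssuerRegistry` is a hypothetical name, and auditing of registry changes is not shown.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Simplified stand-in for the article's TenantIssuer record (plain String tenant id).
record IssuerEntry(String tenantId, String issuerUri, Set<String> allowedAudiences, boolean active) {}

final class InMemoryIssuerRegistry {
    private final Map<String, IssuerEntry> byTenant;
    private final Set<String> allowlistedIssuerUris;

    InMemoryIssuerRegistry(Map<String, IssuerEntry> byTenant, Set<String> allowlistedIssuerUris) {
        this.byTenant = Map.copyOf(byTenant);
        this.allowlistedIssuerUris = Set.copyOf(allowlistedIssuerUris);
    }

    // Empty for unknown tenants, inactive tenants, non-allowlisted issuers,
    // and entries without explicit audiences, so callers fail closed.
    Optional<IssuerEntry> findActiveIssuerForTenant(String tenantId) {
        if (tenantId == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(byTenant.get(tenantId))
                .filter(IssuerEntry::active)
                .filter(e -> allowlistedIssuerUris.contains(e.issuerUri()))
                .filter(e -> !e.allowedAudiences().isEmpty());
    }
}
```

A persistent registry would apply the same filters in its query, and would need a refresh or notification path so that a disabled tenant stops resolving quickly.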

13. Tenant Claim Validation

A JWT can be cryptographically valid but semantically invalid.

Example custom validator:

final class TenantClaimValidator implements OAuth2TokenValidator<Jwt> {

    private final TenantId expectedTenant;

    TenantClaimValidator(TenantId expectedTenant) {
        this.expectedTenant = expectedTenant;
    }

    @Override
    public OAuth2TokenValidatorResult validate(Jwt token) {
        String claim = token.getClaimAsString("tenant_id");

        if (claim == null) {
            OAuth2Error error = new OAuth2Error(
                    "invalid_token",
                    "Missing tenant_id claim",
                    null
            );
            return OAuth2TokenValidatorResult.failure(error);
        }

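        // Compare raw strings: constructing a TenantId from an untrusted claim would throw on a malformed value.
        if (!expectedTenant.value().equals(claim)) {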
            OAuth2Error error = new OAuth2Error(
                    "invalid_token",
                    "Token tenant does not match request tenant",
                    null
            );
            return OAuth2TokenValidatorResult.failure(error);
        }

        return OAuth2TokenValidatorResult.success();
    }
}

Do not log raw tokens.

Do not return detailed validation internals to clients.

14. Mapping Tenant Authorities

Avoid granting global authorities from tenant-local claims.

Bad:

return new SimpleGrantedAuthority("ROLE_ADMIN");

Better:

return new SimpleGrantedAuthority("TENANT_" + tenantId.value() + "_ROLE_ADMIN");

But string authority explosion becomes hard to manage.

For serious systems, keep authorities as coarse technical signals and put domain authorization in policy services.

final class TenantJwtAuthenticationConverter implements Converter<Jwt, AbstractAuthenticationToken> {

    @Override
    public AbstractAuthenticationToken convert(Jwt jwt) {
        TenantId tenantId = new TenantId(jwt.getClaimAsString("tenant_id"));
        String subject = jwt.getSubject();
        String issuer = jwt.getIssuer().toString();
        String clientId = jwt.getClaimAsString("client_id");

        Set<String> roles = Optional.ofNullable(jwt.getClaimAsStringList("roles"))
                .orElse(List.of())
                .stream()
                .collect(Collectors.toUnmodifiableSet());

        TenantAwarePrincipal principal = new TenantAwarePrincipal(
                subject,
                issuer,
                clientId,
                tenantId,
                Set.of(new TenantMembership(tenantId, subject, roles, Set.of(), true)),
                roles,
                AssuranceLevel.UNKNOWN
        );

        Collection<GrantedAuthority> authorities = roles.stream()
                .map(role -> new SimpleGrantedAuthority("TENANT_ROLE_" + role))
                .map(GrantedAuthority.class::cast)
                .toList();

        return new JwtAuthenticationToken(jwt, authorities, subject);
    }
}

In real code, attach the TenantAwarePrincipal through a custom authentication token or principal wrapper.

15. Active Tenant vs Resource Tenant

Resolving active tenant is not enough.

Example:

GET /tenants/tenant-a/cases/case-999

The route says tenant-a.

But case-999 may belong to tenant-b.

Therefore resource-level checks must compare:

active tenant == resource tenant

Example:

public CaseView getCase(TenantAwarePrincipal principal, TenantId tenantId, UUID caseId) {
    authorization.requireTenantMember(principal, tenantId);

    CaseEntity caze = caseRepository.findVisibleByTenantAndId(tenantId, caseId)
            .orElseThrow(NotFoundException::new);

    authorization.requireCanReadCase(principal, caze);

    return CaseView.from(caze);
}

The repository query already includes tenant predicate:

@Query("""
       select c
       from CaseEntity c
       where c.tenantId = :tenantId
         and c.id = :caseId
       """)
Optional<CaseEntity> findVisibleByTenantAndId(@Param("tenantId") TenantId tenantId,
                                              @Param("caseId") UUID caseId);

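This avoids loading cross-tenant data. A case that exists in another tenant produces the same NotFoundException as a case that does not exist, so the response does not reveal whether the ID is valid elsewhere.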
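A tenant escape test for this lookup asks one question: can a caller scoped to one tenant see a case owned by another? The sketch below uses a hypothetical in-memory repository that applies the same tenant predicate as the query; `CaseRow`, `InMemoryCaseRepository`, and `TenantEscapeCheck` are illustrative names, and plain strings stand in for `TenantId`.

```java
import java.util.List;
import java.util.Optional;
import java.util.UUID;

record CaseRow(String tenantId, UUID id, String title) {}

// Hypothetical in-memory stand-in for the repository, applying the same tenant predicate.
final class InMemoryCaseRepository {
    private final List<CaseRow> rows;

    InMemoryCaseRepository(List<CaseRow> rows) {
        this.rows = List.copyOf(rows);
    }

    Optional<CaseRow> findVisibleByTenantAndId(String tenantId, UUID caseId) {
        return rows.stream()
                .filter(r -> r.tenantId().equals(tenantId)) // tenant predicate
                .filter(r -> r.id().equals(caseId))
                .findFirst();
    }
}

final class TenantEscapeCheck {
    // True when a caller scoped to attackerTenant cannot see a case owned by another tenant.
    static boolean crossTenantReadIsBlocked(InMemoryCaseRepository repo,
                                            String attackerTenant,
                                            CaseRow victimCase) {
        return repo.findVisibleByTenantAndId(attackerTenant, victimCase.id()).isEmpty();
    }
}
```

The same negative check is more valuable when run against the real repository and database, since that is where a missing predicate would actually surface.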

16. Create Operations

Create endpoints are often missed.

POST /tenants/tenant-a/cases

The client should not be allowed to set arbitrary tenantId in the body.

Bad:

CaseEntity entity = new CaseEntity(request.tenantId(), request.title());

Better:

CaseEntity entity = CaseEntity.open(
        tenantIdFromRouteAndSecurityContext,
        request.title(),
        principal.subject()
);

Invariant:

For tenant-scoped create operations, tenant comes from verified security/routing context, not from request body.

If the body contains tenantId, treat it as redundant: validate that it equals the verified tenant and reject the request on mismatch.
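The equality rule for a body-supplied tenant can be sketched as a small guard. `CreateTenantGuard` is a hypothetical helper and the exception types are placeholders; plain strings are used instead of `TenantId` to keep the example self-contained.

```java
import java.util.Optional;

final class CreateTenantGuard {
    // verifiedTenantId comes from routing/security context; bodyTenantId is whatever the client sent.
    static String tenantForCreate(String verifiedTenantId, Optional<String> bodyTenantId) {
        if (verifiedTenantId == null || verifiedTenantId.isBlank()) {
            throw new IllegalStateException("No verified tenant in context");
        }
        if (bodyTenantId.isPresent() && !bodyTenantId.get().equals(verifiedTenantId)) {
            // A mismatch is rejected, never resolved in the client's favor.
            throw new IllegalArgumentException("Body tenantId does not match request tenant");
        }
        return verifiedTenantId;
    }
}
```

The returned value is always the verified tenant, so the entity is never constructed from the body field even when the two agree.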

17. Update and Delete Operations

For mutation:

  1. resolve active tenant;
  2. load resource by active tenant + ID;
  3. authorize action using current state;
  4. apply mutation;
  5. preserve tenant ID invariant;
  6. audit old/new sensitive state.

@Transactional
public void assignCase(TenantAwarePrincipal principal,
                       TenantId tenantId,
                       UUID caseId,
                       String assigneeAccountId) {
    CaseEntity caze = caseRepository.findByTenantIdAndIdForUpdate(tenantId, caseId)
            .orElseThrow(NotFoundException::new);

    decisionService.require(principal, "case.assign", caze);

    if (!membershipRepository.isActiveMember(tenantId, assigneeAccountId)) {
        throw new DomainRuleViolation("Assignee is not active in tenant");
    }

    caze.assignTo(assigneeAccountId);

    audit.log(AuditEvent.caseAssigned(
            tenantId,
            caseId,
            principal.subject(),
            assigneeAccountId
    ));
}

Do not let mutation handlers change tenantId, except in controlled tenant migration workflows.

18. Tenant-Safe Data Access Patterns

18.1 Composite Query Pattern

Optional<Invoice> findByTenantIdAndInvoiceId(TenantId tenantId, UUID invoiceId);

Avoid:

Optional<Invoice> findByInvoiceId(UUID invoiceId);

for tenant-scoped objects.

18.2 Composite Unique Constraint

alter table cases
add constraint uq_cases_tenant_external_ref unique (tenant_id, external_ref);

Global uniqueness may leak existence across tenants.

18.3 Foreign Key Includes Tenant

Use composite references when possible.

create table case_tasks (
    tenant_id text not null,
    case_id uuid not null,
    task_id uuid not null,
    primary key (tenant_id, task_id),
    foreign key (tenant_id, case_id) references cases (tenant_id, case_id)
);

This prevents linking a task in tenant A to a case in tenant B.

18.4 Row-Level Security

Database row-level security can provide defense-in-depth.

But it does not replace application authorization.

Use it for:

  • strong tenant predicates;
  • accidental query omission protection;
  • analytics/reporting safety;
  • privileged support tool containment.

Be careful with:

  • connection pooling;
  • transaction-local tenant variables;
  • background jobs;
  • migrations;
  • admin bypass roles.

18.5 Search Index Isolation

Search indexes are frequent tenant escape points.

Invariants:

- Every indexed document includes tenant_id.
- Every search query includes tenant filter.
- Every facet/count/aggregation includes tenant filter.
- Reindex jobs preserve tenant_id.
- Cross-tenant admin search is separate and audited.
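One way to make the tenant filter hard to forget is to make it impossible to build a query without one. The sketch below is engine-agnostic: `TenantSearchQuery` is a hypothetical type, and translating its filter map into a specific search DSL is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical engine-agnostic query description; a real system would translate it to its search DSL.
final class TenantSearchQuery {
    private final String tenantId;
    private final String text;
    private final Map<String, String> filters = new LinkedHashMap<>();

    private TenantSearchQuery(String tenantId, String text) {
        this.tenantId = tenantId;
        this.text = text;
    }

    // The only way to obtain a query is to supply the tenant, so the filter cannot be omitted.
    static TenantSearchQuery forTenant(String tenantId, String text) {
        if (tenantId == null || tenantId.isBlank()) {
            throw new IllegalArgumentException("tenant id is required");
        }
        return new TenantSearchQuery(tenantId, text);
    }

    // Callers may add other filters but cannot override the tenant filter.
    TenantSearchQuery filter(String field, String value) {
        if ("tenant_id".equals(field)) {
            throw new IllegalArgumentException("tenant_id filter is fixed");
        }
        filters.put(field, value);
        return this;
    }

    String text() {
        return text;
    }

    // tenant_id is always emitted first; the same map should back hits, facets, and counts.
    Map<String, String> toFilterMap() {
        Map<String, String> all = new LinkedHashMap<>();
        all.put("tenant_id", tenantId);
        all.putAll(filters);
        return all;
    }
}
```

Cross-tenant admin search would then be a separate, explicitly named type rather than a flag on this one.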

19. Tenant-Safe Cache Design

Cache keys must include tenant.

Bad:

@Cacheable("caseSummary")
public CaseSummary summary(UUID caseId) { ... }

Better:

@Cacheable(cacheNames = "caseSummary", key = "#tenantId.value() + ':' + #caseId")
public CaseSummary summary(TenantId tenantId, UUID caseId) { ... }

But do not rely only on annotation expressions.

For high-risk caches, create typed cache keys:

public record TenantCacheKey(TenantId tenantId, String namespace, String key) {
    public TenantCacheKey {
        Objects.requireNonNull(tenantId);
        Objects.requireNonNull(namespace);
        Objects.requireNonNull(key);
    }
}

Cache invalidation must also be tenant-aware.

cache.evict(new TenantCacheKey(tenantId, "case", caseId.toString()));

Cross-tenant cache bugs are often invisible because everything looks fast and correct until another tenant sees the data.

20. Tenant-Safe Event Design

Events need tenant identity.

{
  "eventId": "evt-123",
  "tenantId": "tenant-a",
  "type": "case.assigned",
  "subject": "case-999",
  "occurredAt": "2026-06-28T12:00:00Z",
  "actor": {
    "type": "user",
    "subject": "user-123"
  }
}

Event invariants:

- Every tenant-scoped event includes tenantId.
- Consumers reject tenant-scoped events without tenantId.
- Consumer authorization uses event tenant, not consumer default tenant.
- Dead-letter queues preserve tenantId.
- Replay jobs preserve tenantId.
- Event subscriptions are tenant-scoped unless explicitly platform-scoped.
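The second invariant, rejecting tenant-scoped events that carry no tenant, can be sketched as a gate in front of the handler. `EventEnvelope` and `TenantEventGate` are hypothetical names; the envelope mirrors a subset of the JSON fields above.

```java
import java.util.Optional;

// Hypothetical minimal envelope; field names mirror the JSON example.
record EventEnvelope(String eventId, String tenantId, String type) {}

final class TenantEventGate {
    // Returns the tenant to process under, or empty when the event must be rejected
    // (for example routed to a dead-letter queue). Never falls back to a default tenant.
    static Optional<String> tenantFor(EventEnvelope event) {
        if (event == null || event.tenantId() == null || event.tenantId().isBlank()) {
            return Optional.empty();
        }
        return Optional.of(event.tenantId());
    }
}
```

A consumer would call this before opening any tenant scope, so that a missing tenant is handled as a rejected message rather than as work for whichever tenant happens to be bound.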

Bad consumer:

void handle(CaseAssigned event) {
    TenantContext.set(currentTenantFromThreadLocal());
    projection.update(event.caseId(), event.assignee());
}

Better:

void handle(CaseAssigned event) {
    TenantId tenantId = event.tenantId();
    try (TenantScope ignored = tenantContext.open(tenantId)) {
        projection.update(tenantId, event.caseId(), event.assignee());
    }
}

21. Tenant Context Propagation

Tenant context travels differently depending on execution model.

| Boundary | Propagation Strategy | Common Failure |
| --- | --- | --- |
| HTTP request | Security context + route/host/token | Trusting spoofable header. |
| Service-to-service | Token audience + tenant claim + mTLS/workload identity | Reusing end-user token everywhere. |
| Message event | Tenant field in event envelope | Event without tenant processed by default tenant. |
| Background job | Job payload includes tenant; worker validates tenant active | Scheduler runs global job with missing tenant. |
| Cache | Tenant in key and namespace | Key collision across tenants. |
| Logging/tracing | Tenant in MDC/span attribute | Missing tenant in async thread. |
| Database | Query predicate/session variable | Connection pool leaks previous tenant variable. |

21.1 ThreadLocal Caution

ThreadLocal can be acceptable in servlet code if managed carefully.

public final class TenantContextHolder {
    private static final ThreadLocal<TenantContext> CURRENT = new ThreadLocal<>();

    public static TenantContext current() {
        TenantContext context = CURRENT.get();
        if (context == null) {
            throw new IllegalStateException("No tenant context bound");
        }
        return context;
    }

    public static TenantScope open(TenantContext context) {
        TenantContext previous = CURRENT.get();
        CURRENT.set(context);
        return () -> {
            if (previous == null) {
                CURRENT.remove();
            } else {
                CURRENT.set(previous);
            }
        };
    }
}

@FunctionalInterface
public interface TenantScope extends AutoCloseable {
    @Override
    void close();
}

Usage:

try (TenantScope ignored = TenantContextHolder.open(context)) {
    chain.doFilter(request, response);
}

Risks:

  • thread reuse leaks context;
  • async execution loses context;
  • reactive pipelines do not use ThreadLocal reliably;
  • tests pass but production fails under concurrency.

For reactive systems, use Reactor context instead of ThreadLocal.
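
The "async execution loses context" risk can be shown with plain `java.util.concurrent`. The sketch below assumes a simplified holder that stores only a tenant ID string; `propagating` and `demo` are hypothetical helpers. It captures the tenant on the submitting thread, rebinds it around the task on the worker, and restores the worker afterwards so a pooled thread does not leak context.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class TenantPropagationSketch {

    // Simplified stand-in for TenantContextHolder: holds only a tenant ID string.
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    static String currentOrNone() {
        String tenant = CURRENT.get();
        return tenant == null ? "none" : tenant;
    }

    /** Captures the caller's tenant now and rebinds it around the task on the worker thread. */
    static <T> Callable<T> propagating(Callable<T> task) {
        String captured = CURRENT.get();
        if (captured == null) {
            throw new IllegalStateException("No tenant context bound at submission time");
        }
        return () -> {
            String previous = CURRENT.get();
            CURRENT.set(captured);
            try {
                return task.call();
            } finally {
                // Restore the worker so the pooled thread does not keep this tenant.
                if (previous == null) {
                    CURRENT.remove();
                } else {
                    CURRENT.set(previous);
                }
            }
        };
    }

    /** Runs three tasks on one pooled worker: unwrapped, wrapped, unwrapped again. */
    static List<String> demo() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Callable<String> read = TenantPropagationSketch::currentOrNone;
        CURRENT.set("tenant-a");
        try {
            String plain = pool.submit(read).get();
            String wrapped = pool.submit(propagating(read)).get();
            String after = pool.submit(read).get();
            return List.of(plain, wrapped, after);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            CURRENT.remove();
            pool.shutdown();
        }
    }
}
```

In `demo()`, the unwrapped tasks see no tenant on the worker thread, while the wrapped task sees `tenant-a`. The third call confirms the worker was cleaned up after the wrapped task finished.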

22. Tenant-Aware Background Jobs

Background jobs are not user requests.

They still require tenant context.

Bad:

@Scheduled(cron = "0 0 * * * *")
void closeExpiredCases() {
    caseRepository.findExpiredCases().forEach(caseService::close);
}

Better:

@Scheduled(cron = "0 0 * * * *")
void closeExpiredCases() {
    tenantRepository.findActiveTenants().forEach(tenant -> {
        JobActor actor = JobActor.system("case-expiry-job");
        TenantContext context = new TenantContext(tenant.id(), TenantResolutionSource.CLIENT_REGISTRATION, false, correlationId());

        try (TenantScope ignored = TenantContextHolder.open(context)) {
            caseRepository.findExpiredCases(tenant.id())
                    .forEach(caze -> caseService.closeExpired(actor, tenant.id(), caze.id()));
        }
    });
}

Job invariants:

- no tenant-scoped job runs without tenant context;
- job actor is explicit;
- job permissions are limited;
- job emits audit events;
- job failure is attributable to tenant;
- job retry preserves tenant;
- tenant disabled/offboarded state is respected.
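
For queue-based workers, the same invariants can be carried in the job payload. This is a sketch under stated assumptions: `JobPayload`, `Outcome`, and `run` are hypothetical, and the set of active tenants stands in for a lookup against the tenant repository.

```java
import java.util.Set;

final class TenantJobSketch {

    /** Hypothetical payload; the tenant travels with the job, including across retries. */
    record JobPayload(String jobName, String tenantId, int attempt) {
        JobPayload retry() {
            return new JobPayload(jobName, tenantId, attempt + 1);
        }
    }

    enum Outcome { EXECUTED, SKIPPED_TENANT_INACTIVE, REJECTED_NO_TENANT }

    /** Runs the work only when the payload names a tenant that is currently active. */
    static Outcome run(JobPayload payload, Set<String> activeTenants, Runnable work) {
        if (payload.tenantId() == null || payload.tenantId().isBlank()) {
            return Outcome.REJECTED_NO_TENANT;
        }
        if (!activeTenants.contains(payload.tenantId())) {
            return Outcome.SKIPPED_TENANT_INACTIVE;
        }
        work.run();
        return Outcome.EXECUTED;
    }
}
```

Returning an explicit outcome, rather than silently skipping, keeps job failures attributable to a tenant in logs and audit events.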

23. Tenant Switching

A user belonging to multiple tenants may switch active tenant.

Safe tenant switching requires:

  1. user is authenticated;
  2. target tenant exists and is active;
  3. user has active membership in target tenant;
  4. active tenant session/token changes atomically;
  5. old tenant caches/UI state are invalidated;
  6. switch is audited when sensitive;
  7. token/session reflects target tenant clearly.

Example switch endpoint:

@PostMapping("/session/active-tenant")
public ResponseEntity<SwitchTenantResponse> switchTenant(@AuthenticationPrincipal CurrentUser user,
                                                         @RequestBody SwitchTenantRequest request) {
    TenantId target = new TenantId(request.tenantId());

    TenantMembership membership = membershipService.findActiveMembership(user.subject(), target)
            .orElseThrow(() -> new AccessDeniedException("Not a member of tenant"));

    SessionToken newSession = sessionService.rotateActiveTenant(user.sessionId(), target);

    audit.log(AuditEvent.tenantSwitched(user.subject(), target, membership.roles()));

    return ResponseEntity.ok(new SwitchTenantResponse(target.value(), newSession.expiresAt()));
}

Do not implement tenant switch as a UI-only variable in local storage.

24. Admin and Support Cross-Tenant Access

Support/admin flows are high risk.

There are at least four different cases:

| Flow | Meaning | Risk |
| --- | --- | --- |
| Platform admin view | Staff views tenant metadata/admin tools. | Broad visibility. |
| Support acting-as tenant | Staff performs support action in tenant context. | Confused audit trail. |
| Impersonation | Staff appears as user or operates through user's effective permissions. | Non-repudiation loss. |
| Break-glass | Emergency access beyond normal grants. | Abuse and compliance risk. |

Do not implement all of these as:

ROLE_SUPER_ADMIN

Use explicit modes.

public enum AccessMode {
    NORMAL_USER,
    PLATFORM_ADMIN,
    SUPPORT_ACTING_AS_TENANT,
    USER_IMPERSONATION,
    BREAK_GLASS
}

Every non-normal mode requires stronger audit.
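
A small guard can make that rule hard to bypass. The sketch below is illustrative only: `AccessRequest`, `requiresEnhancedAudit`, and `validate` are hypothetical names, and requiring a justification string for non-normal modes is one possible policy, not the only one.

```java
final class AccessModeGuardSketch {

    enum AccessMode {
        NORMAL_USER,
        PLATFORM_ADMIN,
        SUPPORT_ACTING_AS_TENANT,
        USER_IMPERSONATION,
        BREAK_GLASS
    }

    /** Hypothetical request shape; justification could be a ticket or incident ID. */
    record AccessRequest(String actor, String targetTenant, AccessMode mode, String justification) {}

    static boolean requiresEnhancedAudit(AccessMode mode) {
        return mode != AccessMode.NORMAL_USER;
    }

    /** Rejects any non-normal access that arrives without a justification. */
    static void validate(AccessRequest request) {
        if (!requiresEnhancedAudit(request.mode())) {
            return;
        }
        if (request.justification() == null || request.justification().isBlank()) {
            throw new IllegalArgumentException(
                    "Access mode " + request.mode() + " requires a justification");
        }
    }
}
```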

Part 020 goes deeper into delegation and impersonation.

25. Tenant Offboarding and Disable

Tenant lifecycle is security lifecycle.

Tenant states:

public enum TenantStatus {
    PROVISIONING,
    ACTIVE,
    SUSPENDED,
    READ_ONLY,
    OFFBOARDING,
    DISABLED,
    DELETED
}

Rules:

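| Status | Login | API Read | API Write | Jobs | Events |
| --- | --- | --- | --- | --- | --- |
| ACTIVE | Yes | Yes | Yes | Yes | Yes |
| SUSPENDED | Maybe admin only | Maybe | No | Limited | Limited |
| READ_ONLY | Yes | Yes | No | Read/maintenance only | No mutations |
| OFFBOARDING | No new sessions | Export only | No | Export/delete workflows | Controlled |
| DISABLED | No | No | No | No tenant jobs | No normal events |
| DELETED | No | No | No | No | No |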

The resource server must check tenant status, not only token validity.

If tenant is disabled, previously issued tokens may still be cryptographically valid.

Therefore high-risk platforms need at least one of the following, and often several in combination:

  • short-lived access tokens;
  • introspection/reference tokens;
  • tenant status check during authorization;
  • revocation/event-driven cache invalidation;
  • emergency deny list.
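
The "tenant status check during authorization" option can be sketched as a pure function over the status table. This is a deliberately conservative illustration with assumptions: only API read and write are modeled, the "Maybe" and "Export only" cells are treated as deny, and `PROVISIONING` (which the table does not cover) is also denied. `Operation` and `allowed` are hypothetical names.

```java
final class TenantStatusGateSketch {

    enum TenantStatus {
        PROVISIONING, ACTIVE, SUSPENDED, READ_ONLY, OFFBOARDING, DISABLED, DELETED
    }

    /** Hypothetical operation classes; a real platform would model more than two. */
    enum Operation { API_READ, API_WRITE }

    /** Evaluated on every request, independently of whether the token is still valid. */
    static boolean allowed(TenantStatus status, Operation operation) {
        return switch (status) {
            case ACTIVE -> true;
            case READ_ONLY -> operation == Operation.API_READ;
            // Assumption: ambiguous or unlisted states fail closed in this sketch.
            case PROVISIONING, SUSPENDED, OFFBOARDING, DISABLED, DELETED -> false;
        };
    }
}
```

Because the switch is exhaustive over the enum, adding a new tenant status forces a compile-time decision about how it is authorized.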

26. Multi-Tenant Audit Records

Audit events must include tenant context.

public record SecurityDecisionAudit(
        String eventId,
        Instant occurredAt,
        TenantId tenantId,
        String subject,
        String clientId,
        String issuer,
        String action,
        String resourceType,
        String resourceId,
        String decision,
        String policyId,
        String reasonCode,
        AccessMode accessMode,
        String correlationId
) {
}

For cross-tenant admin actions, include both the actor's home tenant (or staff org) and the target tenant.

Example:

{
  "actorSubject": "staff-123",
  "actorOrg": "platform-security",
  "targetTenant": "tenant-a",
  "accessMode": "SUPPORT_ACTING_AS_TENANT",
  "action": "case.read",
  "resource": "case-999",
  "reasonCode": "customer-ticket-456",
  "decision": "PERMIT"
}

27. Observability

Tenant-safe metrics:

security_authentication_failures_total{reason="unknown_tenant"}
security_token_validation_failures_total{reason="issuer_mismatch"}
security_authorization_denied_total{reason="tenant_membership_missing"}
security_tenant_conflict_total{source="path_token"}
api_requests_total{tenant_tier="enterprise", outcome="success"}

Be careful with high-cardinality tenant labels.

For metrics systems, avoid raw tenant IDs unless cardinality is controlled and allowed.

For logs/traces, tenant ID is usually needed but must comply with privacy and security policy.

28. Tenant Escape Failure Modes

28.1 Header-Based Tenant Spoofing

GET /cases
X-Tenant-Id: victim-tenant
Authorization: Bearer valid-token-for-attacker

If the API trusts X-Tenant-Id, an attacker holding any valid token can switch to another tenant by changing one header.

Mitigation:

  • reject public tenant headers;
  • use signed/internal gateway metadata;
  • cross-check token membership;
  • route tenant must match token/active tenant.
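
These mitigations can be combined into one fail-closed resolution step. The sketch below is an assumption, not the platform's actual resolver: `resolveActiveTenant` and `TenantResolutionException` are hypothetical, tenant IDs are plain strings, and the membership set stands in for a membership service lookup.

```java
import java.util.Set;

final class TenantResolverSketch {

    static final class TenantResolutionException extends RuntimeException {
        TenantResolutionException(String message) {
            super(message);
        }
    }

    /**
     * Resolves the active tenant from trusted sources only.
     * Any conflict or missing assertion fails instead of picking a winner.
     */
    static String resolveActiveTenant(String routeTenant,
                                      String tokenActiveTenant,
                                      Set<String> activeMemberships,
                                      String publicHeaderTenant) {
        if (publicHeaderTenant != null) {
            throw new TenantResolutionException("Public tenant header is not accepted");
        }
        if (routeTenant == null || tokenActiveTenant == null) {
            throw new TenantResolutionException("Tenant assertion missing");
        }
        if (!routeTenant.equals(tokenActiveTenant)) {
            throw new TenantResolutionException("Route tenant conflicts with token tenant");
        }
        if (!activeMemberships.contains(routeTenant)) {
            throw new TenantResolutionException("Subject has no active membership in tenant");
        }
        return routeTenant;
    }
}
```

This variant rejects the public header outright. Silently ignoring it is the other option, though rejection makes probing attempts visible in metrics.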

28.2 Cache Key Missing Tenant

Tenant A requests case 123; response cached under case:123.

Tenant B also has case 123; receives Tenant A data.

Mitigation:

  • typed tenant cache keys;
  • cache tests with same object ID across tenants;
  • include tenant in invalidation.

28.3 Global Role Interpreted Locally

The token carries a bare admin role with no tenant scope.

The API treats it as admin for every tenant.

Mitigation:

  • tenant-scoped roles;
  • membership service;
  • policy input includes active tenant;
  • no global admin except explicit platform modes.

28.4 Stale Membership Claim

A user is removed from a tenant, but the already-issued JWT stays valid for another 30 minutes.

Mitigation:

  • short-lived tokens;
  • introspection for high-risk operations;
  • membership version claim;
  • policy check against current membership;
  • revocation on removal.
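
The "membership version claim" idea can be sketched as a comparison between the version embedded in the token and the current version held by the platform. Everything here is hypothetical: the claim, `MembershipKey`, and `claimStillValid` are illustrative names, and the map stands in for a membership store in which removal deletes the entry and any role change bumps the version.

```java
import java.util.Map;

final class MembershipVersionSketch {

    /** Hypothetical lookup key for a subject's membership in one tenant. */
    record MembershipKey(String subject, String tenantId) {}

    /**
     * Returns true only when the membership still exists and the token was
     * issued against its current version.
     */
    static boolean claimStillValid(Map<MembershipKey, Long> currentVersions,
                                   String subject,
                                   String tenantId,
                                   long claimedVersion) {
        Long current = currentVersions.get(new MembershipKey(subject, tenantId));
        return current != null && current.longValue() == claimedVersion;
    }
}
```

The check trades one lookup per request for the ability to reject a token before it expires. A short-lived local cache of versions is a common compromise.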

28.5 Issuer Confusion

Tenant A token accepted for Tenant B because both issuers use similar claim structure.

Mitigation:

  • issuer allowlist per tenant;
  • strict issuer and audience validation;
  • tenant claim validator;
  • no dynamic trust from token iss alone.
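
A per-tenant issuer allowlist can be as small as a map lookup, provided it is configured out of band and never derived from the token. In this sketch `issuerTrustedForTenant` is a hypothetical helper and the issuer URLs used with it are placeholders.

```java
import java.util.Map;
import java.util.Set;

final class IssuerAllowlistSketch {

    /**
     * Checks the token's iss value against the issuers configured for the
     * already-resolved tenant. Unknown tenants and missing values are untrusted.
     */
    static boolean issuerTrustedForTenant(Map<String, Set<String>> allowlist,
                                          String tenantId,
                                          String issuer) {
        if (tenantId == null || issuer == null) {
            return false;
        }
        Set<String> trusted = allowlist.get(tenantId);
        return trusted != null && trusted.contains(issuer);
    }
}
```

The comparison is an exact string match, so a lookalike issuer with a similar claim structure does not pass. Signature and audience validation still have to happen separately.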

28.6 Search Aggregation Leak

Search results are filtered by tenant, but facets/counts are global.

Mitigation:

  • tenant filter in query and aggregations;
  • test counts/facets explicitly;
  • separate admin index.

28.7 Background Job Without Tenant

A scheduled job processes all expired cases and writes global results.

Mitigation:

  • job payload includes tenant;
  • jobs iterate tenants explicitly;
  • repository methods require tenant;
  • job audit includes tenant.

28.8 Tenant Migration Bug

Moving a workspace or account across tenants leaves related children under old tenant.

Mitigation:

  • migration transaction checks entire graph;
  • composite foreign keys;
  • post-migration invariant scans;
  • immutable tenant ownership for most resources.
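
A post-migration invariant scan can be sketched as a pass over the resource graph that flags any child whose tenant differs from its parent's. The model is hypothetical: `Node` and `crossTenantChildren` are illustrative names, IDs are assumed unique across the scanned set, and a real scan would run as a query against the datastore.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class TenantGraphScanSketch {

    /** Hypothetical resource row; parentId is null for root resources. */
    record Node(String id, String tenantId, String parentId) {}

    /** Returns children whose tenant does not match their parent's tenant. */
    static List<Node> crossTenantChildren(List<Node> nodes) {
        // Assumes node IDs are unique; toMap throws on duplicates.
        Map<String, String> tenantById = nodes.stream()
                .collect(Collectors.toMap(Node::id, Node::tenantId));

        return nodes.stream()
                .filter(node -> node.parentId() != null)
                // A missing parent yields null here and is flagged as a violation too.
                .filter(node -> !node.tenantId().equals(tenantById.get(node.parentId())))
                .toList();
    }
}
```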

29. Security Invariants

Use these during design review.

1. Every tenant-scoped request has exactly one active tenant.
2. Tenant assertions from multiple sources must match or fail.
3. Public request headers cannot define tenant unless cryptographically/internal-gateway verified.
4. Token issuer must be trusted for the active tenant.
5. Token audience must include the target API.
6. Client must be allowed for the tenant and operation.
7. Subject must have active membership or delegated authority for tenant.
8. Resource tenant must equal active tenant for tenant-scoped operations.
9. Data queries must constrain tenant before returning result sets.
10. Cache keys must include tenant for tenant-scoped data.
11. Events must include tenant and consumers must preserve it.
12. Jobs must run under explicit tenant context.
13. Audit logs must include tenant, actor, client, action, resource, decision, and access mode.
14. Tenant status changes must affect authorization even if tokens remain cryptographically valid.
15. Cross-tenant admin access must use explicit mode, justification, and enhanced audit.

30. Testing Strategy

30.1 Tenant Matrix Test

Create two tenants whose resources deliberately share the same IDs.

tenant-a: case-001
tenant-b: case-001

Then test:

| Scenario | Expected |
| --- | --- |
| User A reads tenant A case | Permit |
| User A reads tenant B case by changing path | Deny/404 |
| User A sends token tenant A with path tenant B | 401/403 |
| User A sends X-Tenant-Id: tenant-b | Ignored or rejected |
| User A lists cases | Only tenant A |
| User A search facets | Only tenant A counts |
| User A export | Only tenant A |
| User A cache hit after tenant B request | No cross-tenant data |

30.2 Spring MockMvc Example

@Test
void userCannotReadCaseFromAnotherTenant() throws Exception {
    TenantId tenantA = new TenantId("tenant-a");
    TenantId tenantB = new TenantId("tenant-b");
    UUID caseId = fixtures.caseInTenant(tenantB);

    Jwt jwt = jwtFor("user-a", tenantA, "case-api", List.of("case:read"));

    mvc.perform(get("/tenants/{tenantId}/cases/{caseId}", tenantA.value(), caseId)
                    .with(jwt().jwt(jwt)))
            .andExpect(status().isNotFound());
}

30.3 Conflict Test

@Test
void conflictingTenantAssertionsFailClosed() throws Exception {
    Jwt jwt = jwtFor("user-a", new TenantId("tenant-a"), "case-api", List.of("case:read"));

    mvc.perform(get("/tenants/tenant-b/cases")
                    .with(jwt().jwt(jwt)))
            .andExpect(status().isUnauthorized());
}

30.4 Cache Test

@Test
void cacheKeyIncludesTenant() {
    UUID caseId = UUID.fromString("00000000-0000-0000-0000-000000000001");

    fixtures.caseWithId(new TenantId("tenant-a"), caseId, "A secret");
    fixtures.caseWithId(new TenantId("tenant-b"), caseId, "B secret");

    CaseSummary a = service.summary(new TenantId("tenant-a"), caseId);
    CaseSummary b = service.summary(new TenantId("tenant-b"), caseId);

    assertThat(a.title()).isEqualTo("A secret");
    assertThat(b.title()).isEqualTo("B secret");
}

30.5 Event Replay Test

@Test
void eventReplayPreservesTenantContext() {
    UUID caseId = UUID.randomUUID();
    CaseAssigned event = new CaseAssigned(
            new TenantId("tenant-a"),
            caseId,
            "assignee-1",
            Instant.now()
    );

    eventHandler.handle(event);

    assertThat(projection.find(new TenantId("tenant-a"), caseId)).isPresent();
    assertThat(projection.find(new TenantId("tenant-b"), caseId)).isEmpty();
}

31. Review Checklist

Use this checklist before shipping tenant-scoped APIs.

Identity and Token

  • Is active tenant explicit?
  • Are tenant assertions conflict-checked?
  • Is issuer trusted for the tenant?
  • Is audience validated?
  • Are client registrations tenant-aware?
  • Are stale memberships handled?
  • Can tenant disable stop new access quickly?

Authorization

  • Is subject membership checked?
  • Is resource tenant checked?
  • Are create operations deriving tenant from trusted context?
  • Are update/delete operations loading by tenant + ID?
  • Are admin/support modes explicit?
  • Is cross-tenant access audited?

Data

  • Do repositories require tenant for tenant-scoped resources?
  • Are list/search/export/count/facet operations tenant-filtered?
  • Do foreign keys prevent cross-tenant child links?
  • Are caches tenant-keyed?
  • Are search indexes tenant-filtered?

Async and Ops

  • Do events include tenant?
  • Do consumers preserve tenant?
  • Do jobs run with explicit tenant context?
  • Are logs/traces tenant-aware?
  • Are tenant onboarding/offboarding changes audited?

32. Practice Drill

Design a tenant model for this scenario:

A regulatory case management SaaS serves multiple agencies.
Some agencies use platform login.
Some agencies require their own enterprise IdP.
Users can belong to multiple agencies.
Support staff may help an agency only after a ticket is approved.
Cases, tasks, attachments, audit records, and exports are tenant-scoped.
A nightly job closes stale draft cases.
A search service indexes case summaries.

Deliverables:

  1. tenant topology choice;
  2. token claims;
  3. tenant resolution rules;
  4. tenant-aware principal model;
  5. repository method naming convention;
  6. event envelope;
  7. cache key format;
  8. support access mode;
  9. tenant escape test matrix;
  10. operational disable plan.

33. Key Takeaways

  • Tenant is a security boundary, not a UI filter.
  • Authentication, tenant resolution, and authorization are separate steps.
  • A valid token can still be invalid for a requested tenant.
  • A route tenant can still conflict with the resource tenant.
  • Tenant must propagate to data, cache, event, job, audit, and observability layers.
  • Multi-issuer systems require issuer allowlists and deterministic resolver logic.
  • Most tenant escape bugs are caused by missing tenant binding in one non-obvious layer.
