Security, ACL, SASL, mTLS, and Governance
Learn Java Kafka in Action - Part 028
Security, ACL, SASL, mTLS, and governance for production Java Kafka systems: threat modeling, encryption, authentication, authorization, service identity, tenant isolation, secrets rotation, topic governance, data classification, auditability, and security operations.
Part 027 covered multi-service transaction boundaries.
Now we move to production security and governance.
Kafka security is not one switch. It is a layered model:
- Network boundary.
- Encryption in transit.
- Authentication.
- Authorization.
- Service identity.
- Topic governance.
- Schema governance.
- Secrets management.
- Audit logging.
- Operational response.
The central idea:
Kafka is a shared distributed log. If identity, authorization, and governance are weak, Kafka becomes a high-throughput data exfiltration and corruption platform.
A top-level Kafka engineer should not only know how to set security.protocol=SASL_SSL. They should be able to design who can produce, consume, create topics, alter configs, read schemas, run connectors, execute ksqlDB queries, and access regulated event data.
1. Kaufman Skill Decomposition
The target skill is securing and governing Kafka as a multi-service, multi-team, production data platform.
| Subskill | Production Meaning |
|---|---|
| Threat modeling | Identify what can go wrong before choosing controls. |
| TLS/mTLS | Encrypt traffic and optionally authenticate clients with certificates. |
| SASL | Authenticate clients using mechanisms such as SCRAM, OAuth/OIDC, Kerberos, or PLAIN over TLS. |
| Principal modeling | Map service identity to Kafka permissions. |
| ACL design | Grant least-privilege access to topics, groups, transactional IDs, and cluster operations. |
| Tenant isolation | Prevent cross-domain data access and operational blast radius. |
| Topic governance | Control naming, ownership, retention, compaction, schema, and lifecycle. |
| Secrets rotation | Rotate credentials/certificates without outage. |
| Auditability | Track who accessed or changed what. |
| Incident response | Revoke access, quarantine topics, and investigate suspicious activity. |
1.1 Practice Goal
By the end of this part, you should be able to:
- Draw a Kafka threat model.
- Choose between mTLS, SASL/SCRAM, OAuth/OIDC, and managed identity patterns.
- Design ACLs for producer, consumer, Kafka Streams, Connect, and ksqlDB workloads.
- Avoid dangerous wildcard permissions.
- Build a topic governance model with owners and data classification.
- Define a credential rotation and incident-response runbook.
- Review Java Kafka client security configuration.
2. Threat Model First
Security design should start with failure modes, not configuration snippets.
2.1 Assets
| Asset | Why It Matters |
|---|---|
| Topic data | May contain customer, financial, regulated, or operational data. |
| Consumer group offsets | Can reveal processing behavior and enable unauthorized replay. |
| Schema Registry | Can expose event structure and sensitive fields. |
| Connect connectors | Can move large volumes of data into or out of external systems. |
| ksqlDB queries | Can derive, join, and expose data from multiple streams. |
| Broker configs | Can weaken durability, retention, or access control. |
| Credentials | Can impersonate producers, consumers, or admins. |
2.2 Threats
| Threat | Example |
|---|---|
| Unauthorized produce | Malicious service publishes fake PaymentReceived event. |
| Unauthorized consume | Team reads customer.pii.v1 without approval. |
| Topic squatting | Service creates topic name before platform governance. |
| Schema poisoning | Producer registers incompatible or misleading schema. |
| Data exfiltration | Connector streams sensitive topic to unauthorized sink. |
| Credential leakage | SASL password in Git repository. |
| Principal reuse | Many services share one Kafka user, destroying accountability. |
| ACL drift | Temporary broad access remains forever. |
| Replay abuse | Consumer reads old events for unauthorized reconstruction. |
| Admin misuse | User alters retention or deletes topic. |
2.3 Security Objective
A secure Kafka platform should provide:
- Confidentiality — unauthorized parties cannot read data.
- Integrity — unauthorized parties cannot produce or alter data/config.
- Availability — security controls do not create fragile operations.
- Accountability — every access maps to a stable identity.
- Least privilege — permissions are scoped to the minimum needed.
- Recoverability — credentials can be revoked and rotated quickly.
- Governance — topics/schemas/connectors have owners and lifecycle rules.
3. Kafka Security Stack
| Layer | Control |
|---|---|
| Network | Private subnets, firewall, security groups, Kubernetes network policies. |
| Encryption | TLS for client-broker and broker-broker traffic. |
| Authentication | mTLS, SASL/SCRAM, SASL/OAUTHBEARER, Kerberos, managed identity. |
| Authorization | Kafka ACLs or platform RBAC. |
| Governance | Topic ownership, schema compatibility, data classification, connector review. |
| Audit | Broker logs, platform audit logs, IAM logs, SIEM integration. |
| Operations | Rotation, revocation, emergency ACL deny, incident playbook. |
Security is strongest when each layer assumes the previous one can fail.
4. Encryption in Transit: TLS
TLS protects traffic between clients and brokers, and between brokers if configured.
Without TLS, credentials and event payloads can be exposed on the network.
4.1 Listener Model
Kafka can expose multiple listeners for different audiences.
listeners=INTERNAL://broker-1:9092,EXTERNAL://broker-1.example.com:9094
advertised.listeners=INTERNAL://broker-1.kafka.svc:9092,EXTERNAL://broker-1.example.com:9094
listener.security.protocol.map=INTERNAL:SSL,EXTERNAL:SASL_SSL
inter.broker.listener.name=INTERNAL
Typical listener separation:
| Listener | Audience | Security |
|---|---|---|
| INTERNAL | Brokers and internal platform components | SSL or SASL_SSL |
| APPLICATION | Application services | SASL_SSL / mTLS |
| ADMIN | Platform operators | Strong auth, restricted network |
| REPLICATION | Cluster linking / replication | Dedicated identity and ACLs |
Do not expose broker listeners broadly without authentication and authorization.
5. Authentication Options
Authentication answers:
Who are you?
Kafka supports multiple authentication patterns depending on distribution and deployment.
5.1 mTLS
mTLS uses client certificates. Both server and client verify each other.
Pros:
- Strong cryptographic identity.
- No shared password in application config.
- Good for service-to-service environments.
- Works well with certificate automation.
Cons:
- Certificate lifecycle must be operated carefully.
- Principal mapping can be tricky.
- Rotation requires coordination.
- Debugging certificate failures can be painful.
5.2 SASL/SCRAM
SCRAM uses username/password-style credentials with challenge-response semantics.
Pros:
- Common and widely supported.
- Easier than mTLS for many application teams.
- Credentials can be rotated through secret stores.
Cons:
- Passwords must be protected.
- Credential sharing is common if governance is weak.
- Rotation discipline is required.
5.3 SASL/PLAIN
SASL/PLAIN sends username/password through SASL and should be used only over TLS.
Use it only when backed by appropriate enterprise identity integration and transport encryption.
5.4 SASL/OAUTHBEARER or OIDC
OAuth/OIDC allows token-based authentication integrated with identity providers.
Pros:
- Centralized identity.
- Short-lived credentials.
- Better integration with enterprise access control.
Cons:
- More moving parts.
- Token validation and claim mapping must be designed.
- Availability of identity provider becomes important.
5.5 Authentication Decision Matrix
| Requirement | Preferred Option |
|---|---|
| Strong service identity with cert automation | mTLS |
| Simpler app credentials | SASL/SCRAM over TLS |
| Enterprise identity provider integration | OAuth/OIDC |
| Legacy/simple internal setup | SASL/PLAIN over TLS only |
| Kerberos enterprise environment | SASL/GSSAPI |
| Cloud-managed Kafka | Provider IAM / managed identity if available |
6. Principal Modeling
A Kafka principal is the authenticated identity used for authorization.
Bad model:
User:kafka-app
used by every service.
Better model:
User:svc-quote-api-prod
User:svc-order-worker-prod
User:svc-risk-streams-prod
User:connect-crm-source-prod
User:ksqldb-analytics-prod
6.1 Identity Rules
- One runtime service should have one stable identity.
- Do not share credentials across unrelated services.
- Separate environments: dev, staging, prod.
- Separate humans from applications.
- Separate CI/CD deploy identity from runtime identity.
- Separate producer identity from admin identity.
- Expire temporary access automatically.
6.2 Principal Naming Convention
svc-<domain>-<component>-<env>
Examples:
svc-quote-api-prod
svc-order-saga-prod
svc-billing-outbox-relay-prod
svc-risk-streams-prod
svc-cdc-postgres-connect-prod
A naming convention improves auditability and ACL review.
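A convention is easier to keep when it is checked automatically, for example in the pipeline that provisions principals. The following is a minimal sketch, assuming lowercase alphanumeric segments and the three environments named above (dev, staging, prod). The class and method names are hypothetical, and it covers only the `svc-` form of the convention.

```java
import java.util.regex.Pattern;

/** Checks service principal names against the svc-<domain>-<component>-<env> convention. */
final class PrincipalNames {

    // Assumption: segments are lowercase alphanumeric and env is one of dev, staging, prod.
    // The component may itself contain hyphens (for example "outbox-relay").
    private static final Pattern SERVICE_PRINCIPAL =
            Pattern.compile("svc-[a-z0-9]+(?:-[a-z0-9]+)+-(?:dev|staging|prod)");

    private PrincipalNames() {
    }

    /** Returns true when the name (without the "User:" prefix) follows the convention. */
    static boolean isValid(String name) {
        return name != null && SERVICE_PRINCIPAL.matcher(name).matches();
    }

    /** Returns the environment suffix of a valid name, or throws if the name is invalid. */
    static String environmentOf(String name) {
        if (!isValid(name)) {
            throw new IllegalArgumentException("Not a valid service principal: " + name);
        }
        return name.substring(name.lastIndexOf('-') + 1);
    }
}
```

With this sketch, `PrincipalNames.isValid("svc-quote-api-prod")` passes while a shared name such as `kafka-app` is rejected, which is the kind of early feedback an ACL review benefits from.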
7. Authorization with ACLs
Authorization answers:
What are you allowed to do?
Kafka ACLs grant or deny operations on resources.
Common resource types:
| Resource Type | Example |
|---|---|
| Topic | quote.approved.v1 |
| Group | order-service.quote-approved-handler |
| Cluster | cluster-level admin operations |
| TransactionalId | risk-score-worker-* |
| DelegationToken | token operations where supported |
Common operations:
| Operation | Meaning |
|---|---|
| Read | Consume from topic or read group metadata. |
| Write | Produce to topic. |
| Create | Create resource. |
| Delete | Delete resource. |
| Alter | Alter resource/config. |
| Describe | Inspect resource metadata. |
| DescribeConfigs | Read configs. |
| AlterConfigs | Change configs. |
| IdempotentWrite | Required for idempotent producer at cluster level in some configurations. |
7.1 Producer ACL
A producer usually needs:
- Write on target topic.
- Describe on target topic.
- IdempotentWrite if required by cluster authorization model.
- Write/Describe on transactional ID if using transactions.
Example intent:
Principal: User:svc-quote-api-prod
Allow Write, Describe on Topic:quote.approved.v1
7.2 Consumer ACL
A consumer usually needs:
- Read on source topic.
- Describe on source topic.
- Read on consumer group.
Example intent:
Principal: User:svc-order-worker-prod
Allow Read, Describe on Topic:quote.approved.v1
Allow Read on Group:order-service.quote-approved-handler
7.3 Kafka Streams ACL
Kafka Streams applications may need more than simple consume/produce ACLs.
They often need:
- Read source topics.
- Write sink topics.
- Create/write/read internal repartition topics.
- Create/write/read changelog topics.
- Read consumer group based on application.id.
- Access transactional IDs if exactly-once is enabled.
For a Streams app with:
application.id=risk-score-streams-prod
internal topics may look like:
risk-score-streams-prod-<store-name>-changelog
risk-score-streams-prod-<node-name>-repartition
Design implication:
application.id is a security and governance boundary, not just a config string.
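Because internal topic names are derived from application.id, the ACL prefix for a Streams application can be derived and checked in code. This is a small sketch under the naming shown above; the class and method names are hypothetical, and the suffix check only looks for the two internal topic kinds mentioned here.

```java
/** Helpers for reasoning about the internal topics of one Kafka Streams application. */
final class StreamsInternalTopics {

    private StreamsInternalTopics() {
    }

    /** Prefix to use in a prefixed ACL for the internal topics of one application. */
    static String aclPrefix(String applicationId) {
        return applicationId + "-";
    }

    /** True when the topic looks like a changelog or repartition topic of the application. */
    static boolean isInternalTopic(String applicationId, String topic) {
        return topic.startsWith(aclPrefix(applicationId))
                && (topic.endsWith("-changelog") || topic.endsWith("-repartition"));
    }

    /**
     * True when another application.id falls under this application's ACL prefix,
     * which would let a prefixed ACL reach across ownership boundaries.
     */
    static boolean prefixCollides(String applicationId, String otherApplicationId) {
        return !applicationId.equals(otherApplicationId)
                && otherApplicationId.startsWith(aclPrefix(applicationId));
    }
}
```

The collision check illustrates why application.id values deserve review: an application named `risk-score` would have a prefix that also covers `risk-score-streams-prod`.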
7.4 Kafka Connect ACL
Connect requires permissions for:
- Connector source/sink topics.
- Internal config topic.
- Internal offset topic.
- Internal status topic.
- Consumer groups.
- Producer writes.
- Sometimes topic creation.
Connect is high-risk because it can move data at scale.
Never grant broad wildcard topic access to a shared Connect principal unless the Connect cluster is itself strongly governed.
7.5 ksqlDB ACL
ksqlDB may need access to:
- Input topics.
- Output topics.
- Internal processing topics.
- Consumer groups.
- Command topic.
- Schema Registry subjects.
Because ksqlDB can join and materialize data, access to ksqlDB is effectively access to derived data products.
8. Least Privilege Design
8.1 Bad ACL Pattern
User:svc-order-api-prod has Read,Write on Topic:*
This is operationally convenient and architecturally dangerous.
Failure modes:
- Service can accidentally write to wrong topic.
- Compromised service can exfiltrate all topic data.
- Audit logs cannot prove intended access.
- Data classification becomes meaningless.
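Broad grants like this are easy to detect mechanically if ACLs are kept as reviewable intent. The sketch below is one possible lint, assuming a hypothetical `AclBinding` record as the in-memory form of an ACL entry; it flags the wildcard resource name `*` and the wildcard principal `User:*`.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Flags ACL bindings that are too broad to pass a least-privilege review. */
final class AclLint {

    /** One ACL binding as intent (hypothetical model, not a Kafka client type). */
    record AclBinding(String principal, String operation, String resourceType, String resourceName) {
    }

    private AclLint() {
    }

    /** A binding is broad when it targets every resource or every principal. */
    static boolean isBroad(AclBinding binding) {
        return "*".equals(binding.resourceName()) || "User:*".equals(binding.principal());
    }

    /** Returns only the broad bindings, in their original order. */
    static List<AclBinding> findBroad(List<AclBinding> bindings) {
        return bindings.stream().filter(AclLint::isBroad).collect(Collectors.toList());
    }
}
```

Running such a check in CI against the declared ACL set turns "no wildcard access unless justified" from a review habit into a gate.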
8.2 Better ACL Pattern
User:svc-order-api-prod:
Write,Describe Topic:order.created.v1
Write,Describe Topic:order.cancelled.v1
Read,Describe Topic:quote.approved.v1
Read Group:order-service.quote-approved-handler
8.3 Prefix ACLs
Prefix ACLs can reduce operational overhead, but use them carefully.
Good use:
User:svc-risk-streams-prod:
Read Topic:risk.input.
Write Topic:risk.output.
All Topic:risk-score-streams-prod-
Dangerous use:
User:svc-any-prod:
Read Topic:*
Prefix ACLs should align with ownership boundaries.
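To make the difference between literal and prefixed patterns concrete, here is a simplified evaluator. It is a sketch, not Kafka's authorizer: the types are hypothetical, and it ignores hosts, super users, and implied operations. It does keep two behaviours that matter for design: a deny rule overrides any allow, and a request with no matching rule is rejected.

```java
import java.util.List;

/** Simplified model of topic ACL evaluation with literal and prefixed patterns. */
final class AclEvaluator {

    enum PatternType { LITERAL, PREFIXED }

    enum Permission { ALLOW, DENY }

    record Rule(String principal, Permission permission, String operation,
                PatternType patternType, String topicPattern) {

        boolean matches(String who, String op, String topic) {
            // A literal "*" is the wildcard resource; a prefixed rule matches by startsWith.
            boolean nameMatches = patternType == PatternType.LITERAL
                    ? topicPattern.equals(topic) || topicPattern.equals("*")
                    : topic.startsWith(topicPattern);
            return principal.equals(who) && operation.equals(op) && nameMatches;
        }
    }

    private AclEvaluator() {
    }

    /** Deny wins over allow; with no matching rule the request is rejected. */
    static boolean isAllowed(List<Rule> rules, String who, String op, String topic) {
        boolean allowed = false;
        for (Rule rule : rules) {
            if (!rule.matches(who, op, topic)) {
                continue;
            }
            if (rule.permission() == Permission.DENY) {
                return false;
            }
            allowed = true;
        }
        return allowed;
    }
}
```

With a prefixed allow on `risk.input.`, any topic created under that prefix later is readable without a new review, which is why the prefix has to match an ownership boundary.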
9. Topic Governance
Security and governance meet at the topic.
A topic should not be just a string. It should have metadata.
9.1 Topic Metadata
| Field | Example |
|---|---|
| Name | quote.approved.v1 |
| Owner team | CPQ Platform |
| Domain | Quote |
| Data classification | Internal / Confidential / Restricted |
| Contains PII | Yes/No |
| Retention | 90 days |
| Cleanup policy | delete / compact / compact,delete |
| Schema subject | quote.approved.v1-value |
| Compatibility mode | backward/full/transitive depending on policy |
| Allowed producers | svc-quote-api-prod |
| Allowed consumers | svc-order-worker-prod, svc-analytics-prod |
| SLA/SLO | availability, freshness, lag |
| Replay policy | allowed, restricted, approval required |
9.2 Topic Naming Convention
Recommended domain-event convention:
<domain>.<entity-or-capability>.<event-name>.v<major>
Examples:
quote.quote-approved.v1
order.order-created.v1
billing.invoice-issued.v1
case.case-escalated.v1
Alternative compact convention:
<domain>.<event-name>.v<major>
Examples:
quote.approved.v1
order.created.v1
case.escalated.v1
The exact convention matters less than consistent ownership and metadata.
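Whichever convention is chosen, it can be enforced with a small parser. The sketch below assumes the `<domain>.<name>.v<major>` shape, where the name segment may contain hyphens; that shape covers every example listed above. The class name and the four-digit cap on the version are illustrative assumptions.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Parses topic names of the form <domain>.<name>.v<major>. */
final class TopicNames {

    /** Parts extracted from a conforming topic name. */
    record Parsed(String domain, String event, int majorVersion) {
    }

    // Assumption: lowercase segments, hyphens allowed, major version between 1 and 9999.
    private static final Pattern TOPIC =
            Pattern.compile("([a-z][a-z0-9-]*)\\.([a-z][a-z0-9-]*)\\.v([1-9][0-9]{0,3})");

    private TopicNames() {
    }

    /** Returns the parsed parts, or empty when the name does not follow the convention. */
    static Optional<Parsed> parse(String topic) {
        Matcher m = TOPIC.matcher(topic);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new Parsed(m.group(1), m.group(2), Integer.parseInt(m.group(3))));
    }
}
```

The parsed domain is also a natural key for looking up the owning team in a topic catalog.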
9.3 Topic Creation Policy
Do not allow arbitrary application topic creation in production unless the platform has strong guardrails.
Governed topic creation should validate:
- Name.
- Owner.
- Partition count.
- Replication factor.
- Retention.
- Cleanup policy.
- Schema requirement.
- Data classification.
- Allowed principals.
- Cost/blast-radius impact.
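A governed creation flow can run these checks before any topic reaches the cluster. The following sketch validates a hypothetical `TopicRequest`. The limits (minimum replication factor 3, at most 100 partitions) are assumed values for illustration, and the classification names are taken from the metadata table in section 9.1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Validates a topic creation request against platform policy. */
final class TopicCreationPolicy {

    /** Request submitted by an application team (hypothetical model). */
    record TopicRequest(String name, String ownerTeam, int partitions, int replicationFactor,
                        String cleanupPolicy, String classification, Set<String> allowedPrincipals) {
    }

    // Assumed policy values; real limits belong to the platform team.
    private static final int MIN_REPLICATION_FACTOR = 3;
    private static final int MAX_PARTITIONS = 100;
    private static final Set<String> CLEANUP_POLICIES = Set.of("delete", "compact", "compact,delete");
    private static final Set<String> CLASSIFICATIONS =
            Set.of("Public", "Internal", "Confidential", "Restricted");

    private TopicCreationPolicy() {
    }

    /** Returns one message per violated rule; an empty list means the request passes. */
    static List<String> violations(TopicRequest r) {
        List<String> found = new ArrayList<>();
        if (r.name() == null || r.name().isBlank()) found.add("name is required");
        if (r.ownerTeam() == null || r.ownerTeam().isBlank()) found.add("owner team is required");
        if (r.partitions() < 1 || r.partitions() > MAX_PARTITIONS) found.add("partition count out of range");
        if (r.replicationFactor() < MIN_REPLICATION_FACTOR) found.add("replication factor too low");
        if (r.cleanupPolicy() == null || !CLEANUP_POLICIES.contains(r.cleanupPolicy())) found.add("unknown cleanup policy");
        if (r.classification() == null || !CLASSIFICATIONS.contains(r.classification())) found.add("unknown data classification");
        if (r.allowedPrincipals() == null || r.allowedPrincipals().isEmpty()) found.add("no allowed principals");
        return found;
    }
}
```

Returning all violations at once, rather than failing on the first, tends to shorten the back-and-forth between application teams and the platform team.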
10. Data Classification
Kafka security must respect data sensitivity.
| Classification | Example | Control |
|---|---|---|
| Public | Product catalog events | Basic authz, normal retention |
| Internal | Operational status events | Team-scoped ACL |
| Confidential | Customer order events | Strict ACL, audit, limited consumers |
| Restricted | PII, enforcement, payment, legal events | Explicit approval, encryption, masking, strict retention |
10.1 PII and Sensitive Data
Avoid placing unnecessary sensitive fields in event payloads.
Bad:
{
"customerId": "C-991",
"fullName": "...",
"nationalId": "...",
"birthDate": "...",
"address": "...",
"quoteApproved": true
}
Better:
{
"customerId": "C-991",
"quoteId": "Q-881",
"approvedAt": "2026-07-01T10:00:00Z",
"riskBand": "STANDARD"
}
Use references when possible. Publish sensitive data only when the consumer genuinely needs it.
10.2 Field-Level Protection
For highly sensitive data, consider:
- Application-level encryption for selected fields.
- Tokenization.
- Data minimization.
- Separate restricted topics.
- Different retention policy.
- Strict consumer approval.
- Audit logging of consumption.
Kafka topic ACLs protect topic access, not individual fields.
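As one illustration of field-level protection, a producer can replace a sensitive value with a keyed hash before publishing. Note what this sketch is and is not: it uses HMAC-SHA256 from the JDK to produce a deterministic, non-reversible token (pseudonymization), not a vault-backed tokenization service. The class name is hypothetical, and key storage and rotation are out of scope here.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Replaces sensitive field values with deterministic HMAC-SHA256 tokens. */
final class FieldTokenizer {

    private final SecretKeySpec key;

    /** The secret must come from a secret manager; it is never embedded in code. */
    FieldTokenizer(byte[] secret) {
        this.key = new SecretKeySpec(secret, "HmacSHA256");
    }

    /** Returns a 64-character hex token; the same input and key always yield the same token. */
    String tokenize(String value) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(key);
            byte[] digest = mac.doFinal(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 is not available", e);
        }
    }
}
```

Deterministic tokens still allow consumers to join or count by the field without ever seeing the raw value. If consumers need the original value back, this approach does not apply and encryption or a token vault is the relevant option.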
11. Schema Registry Governance
Schema Registry is part of the security boundary.
A consumer that can access a schema may infer business meaning even without current data access.
Governance should define:
- Who can register schemas.
- Who can read schemas.
- Compatibility modes per subject.
- Review process for breaking changes.
- Sensitive field naming rules.
- Deprecation policy.
11.1 Schema Subject Ownership
| Subject | Owner | Compatibility |
|---|---|---|
| quote.approved.v1-value | CPQ Platform | BACKWARD or FULL depending on consumer model |
| order.created.v1-value | Order Platform | BACKWARD |
| case.escalated.v1-value | Enforcement Platform | FULL_TRANSITIVE for stricter audit needs |
Schema governance should be integrated with CI/CD.
12. Java Client Security Configuration
12.1 SASL/SCRAM over TLS
bootstrap.servers=kafka-1.example.com:9094,kafka-2.example.com:9094
security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
username="svc-order-worker-prod" \
password="${KAFKA_PASSWORD}";
ssl.truststore.location=/etc/kafka/secrets/truststore.p12
ssl.truststore.password=${TRUSTSTORE_PASSWORD}
ssl.truststore.type=PKCS12
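Do not hardcode secrets in source code. The ${...} placeholders above stand for values injected at deploy time; a plain properties file does not expand them on its own, so they must be resolved by your templating tool or by a Kafka config provider.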
12.2 mTLS Client
bootstrap.servers=kafka-1.example.com:9094,kafka-2.example.com:9094
security.protocol=SSL
ssl.truststore.location=/etc/kafka/secrets/truststore.p12
ssl.truststore.password=${TRUSTSTORE_PASSWORD}
ssl.truststore.type=PKCS12
ssl.keystore.location=/etc/kafka/secrets/keystore.p12
ssl.keystore.password=${KEYSTORE_PASSWORD}
ssl.keystore.type=PKCS12
ssl.key.password=${KEY_PASSWORD}
12.3 Java Properties Builder
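import java.nio.file.Path;
import java.util.Properties;

import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.common.config.SaslConfigs;
import org.apache.kafka.common.config.SslConfigs;

public final class KafkaSecurityProperties {

    private KafkaSecurityProperties() {
        // Static factory only; not meant to be instantiated.
    }

    // Builds client properties for SASL/SCRAM-SHA-512 over TLS.
    // Callers pass secrets in at runtime; nothing is hardcoded here.
    public static Properties scramSsl(
            String bootstrapServers,
            String username,
            String password,
            Path truststorePath,
            String truststorePassword
    ) {
        Properties props = new Properties();
        props.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
        props.put(SaslConfigs.SASL_MECHANISM, "SCRAM-SHA-512");
        // JAAS values are double-quoted; this assumes username and password contain no double quotes.
        props.put(SaslConfigs.SASL_JAAS_CONFIG,
                "org.apache.kafka.common.security.scram.ScramLoginModule required "
                        + "username=\"" + username + "\" "
                        + "password=\"" + password + "\";");
        // The truststore lets the client verify the broker certificate.
        props.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG, truststorePath.toString());
        props.put(SslConfigs.SSL_TRUSTSTORE_PASSWORD_CONFIG, truststorePassword);
        props.put(SslConfigs.SSL_TRUSTSTORE_TYPE_CONFIG, "PKCS12");
        return props;
    }
}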
In production, inject secrets through your platform's secret manager. Fall back to plain environment variables only when no stronger mechanism is available.
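The mTLS client configuration from section 12.2 can be built the same way. This sketch uses the literal property keys from that section with `java.util.Properties`, so it compiles without the Kafka client library on the classpath. The class and method names are hypothetical.

```java
import java.nio.file.Path;
import java.util.Properties;

/** Builds client properties for mutual TLS using the keys shown in section 12.2. */
final class MtlsClientProperties {

    private MtlsClientProperties() {
    }

    static Properties mtls(String bootstrapServers,
                           Path truststore, String truststorePassword,
                           Path keystore, String keystorePassword, String keyPassword) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // SSL without SASL: the client certificate in the keystore is the identity.
        props.setProperty("security.protocol", "SSL");
        props.setProperty("ssl.truststore.location", truststore.toString());
        props.setProperty("ssl.truststore.password", truststorePassword);
        props.setProperty("ssl.truststore.type", "PKCS12");
        props.setProperty("ssl.keystore.location", keystore.toString());
        props.setProperty("ssl.keystore.password", keystorePassword);
        props.setProperty("ssl.keystore.type", "PKCS12");
        props.setProperty("ssl.key.password", keyPassword);
        return props;
    }
}
```

The returned `Properties` can be passed to a producer or consumer constructor. Keeping the builder free of secret lookups makes it easy to test and leaves the choice of secret source to the caller.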
13. Broker-Side Security Configuration Sketch
This is a simplified illustration, not a complete production config.
process.roles=broker
listeners=SASL_SSL://broker-1:9094
advertised.listeners=SASL_SSL://broker-1.example.com:9094
listener.security.protocol.map=SASL_SSL:SASL_SSL
ssl.keystore.location=/etc/kafka/secrets/broker.keystore.p12
ssl.keystore.password=${KEYSTORE_PASSWORD}
ssl.keystore.type=PKCS12
ssl.truststore.location=/etc/kafka/secrets/broker.truststore.p12
ssl.truststore.password=${TRUSTSTORE_PASSWORD}
ssl.truststore.type=PKCS12
sasl.enabled.mechanisms=SCRAM-SHA-512
listener.name.sasl_ssl.scram-sha-512.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required;
authorizer.class.name=org.apache.kafka.metadata.authorizer.StandardAuthorizer
allow.everyone.if.no.acl.found=false
super.users=User:platform-kafka-admin
Important production notes:
- Do not set allow.everyone.if.no.acl.found=true in production unless you have a very specific migration plan.
- Protect super users aggressively.
- Separate admin listener/network where possible.
- Validate inter-broker and controller listener security separately.
- Keep controller quorum security in scope for KRaft clusters.
14. ACL Examples as Intent
Exact CLI syntax varies by version/distribution, but the intent should be clear.
14.1 Producer
kafka-acls \
--bootstrap-server kafka.example.com:9094 \
--command-config admin.properties \
--add \
--allow-principal User:svc-quote-api-prod \
--operation Write \
--operation Describe \
--topic quote.approved.v1
14.2 Consumer
kafka-acls \
--bootstrap-server kafka.example.com:9094 \
--command-config admin.properties \
--add \
--allow-principal User:svc-order-worker-prod \
--operation Read \
--operation Describe \
--topic quote.approved.v1
kafka-acls \
--bootstrap-server kafka.example.com:9094 \
--command-config admin.properties \
--add \
--allow-principal User:svc-order-worker-prod \
--operation Read \
--group order-service.quote-approved-handler
14.3 Kafka Streams Internal Topics
kafka-acls \
--bootstrap-server kafka.example.com:9094 \
--command-config admin.properties \
--add \
--allow-principal User:svc-risk-streams-prod \
--operation All \
--topic risk-score-streams-prod- \
--resource-pattern-type prefixed
Be careful: All on a prefix is acceptable only when the prefix is exclusively owned by that application.
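Keeping ACLs as data and rendering the command from that data makes reviews and drift detection easier than hand-written commands. The sketch below renders the argument list for the `kafka-acls` invocations shown above from a hypothetical `Intent` record. It only uses flags that appear in those examples and does not execute anything.

```java
import java.util.ArrayList;
import java.util.List;

/** Renders a kafka-acls argument list from a declarative ACL intent. */
final class AclCommand {

    /** resourceFlag is "--topic" or "--group", matching the examples in this section. */
    record Intent(String principal, List<String> operations, String resourceFlag,
                  String resourceName, boolean prefixed) {
    }

    private AclCommand() {
    }

    static List<String> render(String bootstrapServer, String commandConfig, Intent intent) {
        List<String> args = new ArrayList<>(List.of(
                "kafka-acls",
                "--bootstrap-server", bootstrapServer,
                "--command-config", commandConfig,
                "--add",
                "--allow-principal", intent.principal()));
        for (String operation : intent.operations()) {
            args.add("--operation");
            args.add(operation);
        }
        args.add(intent.resourceFlag());
        args.add(intent.resourceName());
        if (intent.prefixed()) {
            // Without this flag the resource name is treated as a literal.
            args.add("--resource-pattern-type");
            args.add("prefixed");
        }
        return args;
    }
}
```

The same intent records can feed the review tooling, so what gets applied to the cluster and what was approved come from one source.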
15. Governance for Kafka Streams, Connect, and ksqlDB
15.1 Kafka Streams Governance
Require each Streams application to declare:
- application.id.
- Source topics.
- Sink topics.
- Internal topic prefix.
- State stores.
- Exactly-once setting.
- Data classification.
- Restore-time expectation.
- Owner and on-call.
15.2 Kafka Connect Governance
Require each connector to declare:
- Connector owner.
- Source/sink system.
- Topic list or pattern.
- Data classification.
- Error handling and DLQ policy.
- Secret references.
- Connector principal.
- External system permission.
- Backfill/replay policy.
- Decommission plan.
15.3 ksqlDB Governance
Require each persistent query to declare:
- Query owner.
- Input streams/tables.
- Output topic/table.
- Join sources.
- Data classification of output.
- Retention.
- Query version.
- Rollback plan.
- Access policy for pull/push query.
ksqlDB can create derived data products. Govern output topics as seriously as source topics.
16. Multi-Tenant Kafka
Multi-tenancy can mean different things.
| Model | Description | Isolation Strength |
|---|---|---|
| Shared cluster, shared topics | Multiple teams share topic space | Weak |
| Shared cluster, topic prefixes | Teams own prefixes/domains | Medium |
| Shared cluster, separate principals and quotas | ACL + quota isolation | Medium/Strong |
| Separate clusters by environment/domain | Stronger blast-radius control | Strong |
| Managed Kafka with IAM/RBAC | Depends on provider configuration | Medium/Strong |
16.1 Tenant Isolation Controls
- Topic prefix ownership.
- Principal per service/team.
- ACL per topic/group/transactional ID.
- Client quotas.
- Network segmentation.
- Schema subject ownership.
- Connector isolation.
- ksqlDB access separation.
- Audit logs.
- Data classification review.
16.2 Quotas
Quotas protect availability.
Use quotas to prevent one tenant/service from exhausting:
- Producer bandwidth.
- Consumer bandwidth.
- Request rate.
- Controller/admin operation capacity.
Security includes availability protection, not only confidentiality.
17. Secrets Management and Rotation
17.1 Bad Practices
- Credentials in Git.
- Credentials in container image.
- Shared service account across many apps.
- Manual untracked password rotation.
- Long-lived certificates without renewal process.
- Admin credentials available to application runtime.
17.2 Better Practices
- Use a secret manager.
- Inject secrets at runtime.
- Use separate identities per workload.
- Rotate credentials regularly.
- Automate certificate issuance/renewal.
- Alert on expiring certificates.
- Revoke unused principals.
- Keep admin credentials separate.
17.3 Rotation Runbook
Rotation should be observable:
- Which apps switched?
- Which apps still use old credentials?
- Are there authentication failures?
- Is any old credential still active after deadline?
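These questions can be answered from data if each application reports which credential version it currently uses. The sketch below assumes such a report exists as a map from application name to version number; how that report is collected (metrics, deployment metadata, broker logs) is left open, and all names are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Answers rotation questions from a report of credential versions in use. */
final class RotationStatus {

    private RotationStatus() {
    }

    /** Applications still using a credential version other than the target one, sorted by name. */
    static Set<String> stillOnOld(Map<String, Integer> versionInUseByApp, int targetVersion) {
        Set<String> lagging = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : versionInUseByApp.entrySet()) {
            if (entry.getValue() != targetVersion) {
                lagging.add(entry.getKey());
            }
        }
        return lagging;
    }

    /** The old credential can be revoked once every app has switched and no auth failures are seen. */
    static boolean safeToRevokeOld(Map<String, Integer> versionInUseByApp, int targetVersion,
                                   long authFailuresSinceCutover) {
        return stillOnOld(versionInUseByApp, targetVersion).isEmpty() && authFailuresSinceCutover == 0;
    }
}
```

Gating revocation on both conditions keeps the rotation from turning into an outage when one application was missed.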
18. Security Observability
18.1 Metrics and Logs
| Signal | Meaning |
|---|---|
| Authentication failure count | Bad credentials, attack, expired cert. |
| Authorization failure count | Missing ACL, suspicious access attempt. |
| Topic creation events | Governance bypass or deployment activity. |
| ACL changes | Permission drift or emergency access. |
| Consumer group creation | New data access path. |
| Connector creation/update | Potential data movement. |
| ksqlDB query creation | Derived data access. |
| Schema registration | Contract evolution or schema poisoning risk. |
| Certificate expiry | Upcoming outage risk. |
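Of these signals, certificate expiry is the simplest to turn into an alert. The following sketch classifies a certificate by its expiry instant; the class name and the idea of a configurable warning window are assumptions for illustration.

```java
import java.time.Duration;
import java.time.Instant;

/** Classifies a certificate by how close it is to its expiry instant. */
final class CertificateExpiry {

    enum Status { OK, WARNING, EXPIRED }

    private CertificateExpiry() {
    }

    /**
     * EXPIRED when now is at or past notAfter, WARNING when the remaining
     * lifetime is within the warning window, OK otherwise.
     */
    static Status check(Instant notAfter, Instant now, Duration warningWindow) {
        if (!now.isBefore(notAfter)) {
            return Status.EXPIRED;
        }
        if (Duration.between(now, notAfter).compareTo(warningWindow) <= 0) {
            return Status.WARNING;
        }
        return Status.OK;
    }
}
```

For a loaded `java.security.cert.X509Certificate`, the expiry instant is available as `certificate.getNotAfter().toInstant()`. Passing `now` in explicitly keeps the check deterministic in tests.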
18.2 Audit Questions
A Kafka audit trail should answer:
- Who produced to this topic?
- Who consumed from this topic?
- Who changed topic configuration?
- Who created or deleted topics?
- Who changed ACLs?
- Who registered schema versions?
- Who created connectors or ksqlDB queries?
- Which credential was used?
- From which host/network?
- At what time?
19. Incident Response
19.1 Credential Leak
Steps:
- Identify leaked principal.
- Revoke or disable credential.
- Rotate replacement credential.
- Review ACLs for leaked principal.
- Search audit logs for suspicious access.
- Check consumed topics and produced topics.
- Reprocess or quarantine suspicious events if needed.
- Document timeline.
- Add preventive control.
19.2 Unauthorized Produce
Steps:
- Stop offending principal.
- Identify topic, partition, offset range.
- Determine event types affected.
- Quarantine or mark suspicious records through downstream controls.
- Publish correction/supersession events if needed.
- Rebuild projections from trusted offset range if required.
- Tighten ACLs and schema registration controls.
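Since records already written to a topic cannot be edited in place, quarantine usually happens in the consumers. One way to sketch that downstream control is a lookup of suspicious offset ranges that a consumer consults before applying a record. The types below are hypothetical, and ranges are treated as inclusive on both ends.

```java
import java.util.List;

/** Lets consumers skip or flag records that fall in a suspicious offset range. */
final class OffsetQuarantine {

    /** Inclusive offset range on one topic partition identified during the investigation. */
    record Range(String topic, int partition, long fromOffset, long toOffset) {

        boolean contains(String otherTopic, int otherPartition, long offset) {
            return topic.equals(otherTopic)
                    && partition == otherPartition
                    && offset >= fromOffset
                    && offset <= toOffset;
        }
    }

    private OffsetQuarantine() {
    }

    /** True when the record position falls inside any quarantined range. */
    static boolean isQuarantined(List<Range> ranges, String topic, int partition, long offset) {
        return ranges.stream().anyMatch(range -> range.contains(topic, partition, offset));
    }
}
```

The same range list documents the blast radius for the incident timeline and defines where a rebuild from trusted offsets has to start.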
19.3 Unauthorized Consume
Steps:
- Revoke read ACL.
- Identify data classification of consumed topics.
- Determine offset range and time window.
- Review logs for source IP/host.
- Notify security/privacy stakeholders according to policy.
- Rotate affected credentials if necessary.
- Review why access was granted.
20. Production Security Checklist
20.1 Cluster
- TLS enabled for client-broker traffic.
- Broker-broker/controller traffic secured.
- Authentication enabled.
- Authorization enabled.
- allow.everyone.if.no.acl.found=false.
- Super users minimized.
- Admin access separated from app access.
- Audit logs retained.
- Topic auto-creation policy reviewed.
- Quotas configured for critical tenants.
20.2 Application
- Unique principal per service.
- No shared runtime credentials.
- Secrets stored in secret manager.
- Client uses TLS certificate validation.
- ACLs limited to required topics/groups.
- Transactional ID ACL scoped if transactions are used.
- Consumer group names are service-specific.
- No wildcard access unless justified by ADR.
- Logs do not print secrets or sensitive payloads.
20.3 Topic
- Owner assigned.
- Data classification assigned.
- Retention reviewed.
- Cleanup policy reviewed.
- Schema subject configured.
- Compatibility mode selected.
- Producer list approved.
- Consumer list approved.
- Replay policy documented.
20.4 Connect and ksqlDB
- Dedicated principal per Connect cluster or connector class.
- Internal topics secured.
- Connector configs do not expose secrets.
- DLQ topics governed.
- ksqlDB command/internal/output topics secured.
- Persistent queries reviewed as data products.
21. Architecture Review Questions
Ask these before approving a Kafka security design:
- What principal does each service use?
- Can that principal read or write anything outside its domain?
- Can any app create topics in production?
- Who can register schemas?
- Who can run ksqlDB queries?
- Who can create or update connectors?
- What happens when a credential leaks?
- Can we rotate secrets without outage?
- Can we prove who consumed restricted data?
- Are internal topics protected?
- Are topic prefixes aligned with ownership?
- Are wildcard ACLs justified and time-bound?
- Is PII minimized in event payloads?
- Is retention compatible with privacy and audit policy?
- Are admin and runtime identities separate?
22. Anti-Patterns
22.1 One User for All Services
This destroys accountability and makes least privilege impossible.
22.2 Wildcard Read on All Topics
This turns Kafka into a data lake with no access boundary.
22.3 TLS Without Authorization
Encryption protects transport. It does not decide who may read/write.
22.4 ACLs Without Ownership Metadata
Permissions drift when nobody owns topics.
22.5 Connect Cluster with Broad Access
A compromised connector can exfiltrate entire domains.
22.6 ksqlDB as Ungoverned Analytics Backdoor
ksqlDB can join and materialize sensitive derived data. Govern it.
22.7 Secrets in App Config Repository
This is a security incident waiting to happen.
22.8 No Credential Rotation Test
A rotation process that has never been tested is only a document.
23. Deliberate Practice
Exercise 1 — Threat Model a Kafka Domain
Choose one domain such as order, billing, case management, or customer profile.
Document:
- Topics.
- Producers.
- Consumers.
- Sensitive fields.
- Unauthorized produce impact.
- Unauthorized consume impact.
- Admin misuse impact.
- Credential leak response.
Exercise 2 — Write Least-Privilege ACL Intent
For one service, list:
- Topics it produces.
- Topics it consumes.
- Consumer group.
- Transactional ID if any.
- Internal topics if Kafka Streams.
- Required operations only.
Exercise 3 — Design Topic Metadata
For one topic, write a topic governance record:
- Name.
- Owner.
- Data classification.
- Retention.
- Schema subject.
- Compatibility mode.
- Allowed producers.
- Allowed consumers.
- Replay policy.
Exercise 4 — Credential Rotation Simulation
Design and test a rotation plan:
- Create new credential.
- Deploy app with new credential.
- Verify authentication success.
- Revoke old credential.
- Confirm old credential fails.
- Check monitoring alerts.
Exercise 5 — ksqlDB Governance Review
Pick one persistent query and answer:
- What data does it read?
- What derived data does it create?
- Who can query the result?
- Does the output contain more sensitive data than the input?
- What is the retention and deletion policy?
24. Mental Model Summary
Kafka security is a layered control system.
Key conclusions:
- Kafka security starts with threat modeling.
- TLS encrypts traffic but does not replace authentication or authorization.
- Each service should have a distinct principal.
- ACLs should be least-privilege and aligned with topic ownership.
- Kafka Streams, Connect, and ksqlDB need special governance because they create internal topics and derived data paths.
- Topic metadata is part of the security model.
- Sensitive fields require data minimization and sometimes field-level protection.
- Secrets rotation and incident response must be practiced, not only documented.
- Wildcards are governance debt unless tightly scoped and justified.
- Kafka is a data platform; secure it like one.
25. References
- Apache Kafka Documentation — Security: https://kafka.apache.org/documentation/#security
- Apache Kafka Documentation — Authorization and ACLs: https://kafka.apache.org/43/security/authorization-and-acls/
- Confluent Documentation — Security Overview: https://docs.confluent.io/platform/current/security/overview.html
- Confluent Documentation — ACL Authorization: https://docs.confluent.io/platform/current/security/authorization/acls/overview.html
- Confluent Documentation — SASL/PLAIN: https://docs.confluent.io/platform/current/security/authentication/sasl/plain/overview.html
- Confluent Documentation — TLS/mTLS Authentication: https://docs.confluent.io/platform/current/security/authentication/mutual-tls/overview.html
26. What Comes Next
Part 029 moves into production observability:
How do we observe Kafka brokers, producers, consumers, Kafka Streams tasks, Connect workers, ksqlDB queries, lag, rebalances, errors, traces, and SLOs?