Security Context, Pod Security, and Admission

Learn Kubernetes with Cloud Services AWS & Azure - Part 023

Production-grade security context, Pod Security Standards, Pod Security Admission, and admission-control design for Kubernetes workloads on EKS and AKS.

[2026-07-03] · 15 min read · 2834 words

Part 023 — Security Context, Pod Security, and Admission

Kubernetes security is not one setting. It is a chain of constraints.

The common beginner mistake is to think:

"We use private clusters and RBAC, so workloads are secure."

That is false.

RBAC controls who can ask the API server to do something. NetworkPolicy controls which Pods can talk to which endpoints. Workload identity controls which cloud APIs a Pod can call. But once the Pod is scheduled, the container process still has a Linux security boundary, filesystem boundary, capability boundary, namespace boundary, and admission-time policy boundary.

This part is about that layer.

The target skill is not memorizing every securityContext field. The target skill is the ability to review a workload and answer:

"If this container is compromised, how far can the attacker move from inside the Pod?"

That is the real production question.

1. The security model in one sentence

A Kubernetes workload is secure only when the requested runtime privileges are no wider than the application needs, and the API server refuses unsafe Pod specs before they reach the node.

That sentence gives us two responsibilities:

Declare a safe runtime contract using Pod/container security context.
Enforce safe defaults and guardrails using admission control.

The key observation: by the time the container starts, it is too late to negotiate safety. Safety must be encoded in spec and enforced at admission.

2. What problem `securityContext` actually solves

A container is not a virtual machine. It is a process running on a node using Linux primitives such as namespaces, cgroups, capabilities, seccomp, AppArmor, SELinux, and filesystem mounts.

securityContext lets you constrain how that process runs.

There are two levels:

apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  securityContext:          # Pod-level defaults
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: example/app:1.0.0
      securityContext:      # Container-level overrides
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]

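Pod-level settings act as defaults for every container in the Pod. When the same field is set at both levels, the container-level value wins. Some fields, including capabilities, allowPrivilegeEscalation, and readOnlyRootFilesystem, exist only at container level, while fsGroup exists only at Pod level.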
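The precedence between the two levels can be sketched in a few lines of Python. This is a simplified model, not kubelet code: it assumes the manifest is already parsed into plain dicts, it covers only the shared fields used in this lesson, and effective_security_context is a hypothetical helper name.

```python
from typing import Any

# Fields that can be set at both Pod and container level.
# Assumption: only the subset used in this lesson; the real API has more.
SHARED_FIELDS = ("runAsUser", "runAsGroup", "runAsNonRoot", "seccompProfile")


def effective_security_context(pod_spec: dict[str, Any], container: dict[str, Any]) -> dict[str, Any]:
    """Merge Pod-level defaults with container-level settings; the container wins."""
    pod_sc = pod_spec.get("securityContext") or {}
    container_sc = container.get("securityContext") or {}
    # Start from the Pod-level values that containers inherit.
    effective = {field: pod_sc[field] for field in SHARED_FIELDS if field in pod_sc}
    # Container-level values override inherited ones and add container-only fields.
    effective.update(container_sc)
    return effective


pod_spec = {
    "securityContext": {"runAsNonRoot": True, "seccompProfile": {"type": "RuntimeDefault"}},
    "containers": [
        {
            "name": "app",
            "securityContext": {
                "allowPrivilegeEscalation": False,
                "readOnlyRootFilesystem": True,
                "capabilities": {"drop": ["ALL"]},
            },
        }
    ],
}

effective = effective_security_context(pod_spec, pod_spec["containers"][0])
print(effective)
```

Reviewing the merged result per container, rather than the two blocks separately, makes it harder to miss a container that quietly overrides a Pod-level default.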

A production review should not ask only:

"Does it have a security context?"

A better review asks:

"Which kernel-level powers does this process receive, and why?"

3. The production baseline: restricted-by-default workload

For most stateless applications, start here:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-api
  namespace: app-orders
spec:
  replicas: 3
  selector:
    matchLabels:
      app.kubernetes.io/name: orders-api
  template:
    metadata:
      labels:
        app.kubernetes.io/name: orders-api
    spec:
      automountServiceAccountToken: false
      securityContext:
        runAsNonRoot: true
        runAsUser: 10001
        runAsGroup: 10001
        fsGroup: 10001
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: orders-api
          image: registry.example.com/orders-api:1.4.7
          ports:
            - name: http
              containerPort: 8080
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: cache
              mountPath: /app/cache
          resources:
            requests:
              cpu: "250m"
              memory: "512Mi"
            limits:
              memory: "768Mi"
      volumes:
        - name: tmp
          emptyDir: {}
        - name: cache
          emptyDir: {}

Important details:

runAsNonRoot: true makes the kubelet refuse to start the container if it would run as UID 0.
runAsUser and runAsGroup make runtime identity explicit.
allowPrivilegeEscalation: false prevents the process from gaining more privileges than its parent, for example through setuid binaries.
capabilities.drop: ["ALL"] removes every Linux capability the runtime would otherwise grant by default.
seccompProfile.type: RuntimeDefault uses the container runtime's default syscall filter.
readOnlyRootFilesystem: true makes the application prove its writable paths are explicit.
automountServiceAccountToken: false avoids giving API credentials to Pods that do not need them.
emptyDir mounts create explicit writable scratch space.

This is not perfect security. It is a sane baseline.

4. Container privilege is a blast-radius multiplier

A compromised container can do only what the process is allowed to do. Runtime privileges determine the difference between "application compromise" and "node compromise".

4.1 Dangerous fields

These fields deserve automatic review:

securityContext:
  privileged: true

privileged: true is close to saying: "this container can act like root on the node." It is sometimes necessary for low-level agents, but it should be treated as an exception.

hostNetwork: true
hostPID: true
hostIPC: true

These make the Pod share host namespaces. They are common in networking, monitoring, and node agents, but they collapse isolation.

volumes:
  - name: host
    hostPath:
      path: /var/run/docker.sock

hostPath can expose sensitive node files or sockets. Mounting the container runtime socket is especially dangerous because it can often lead to node control.

securityContext:
  capabilities:
    add:
      - NET_ADMIN
      - SYS_ADMIN

Capabilities are fragments of root privilege. SYS_ADMIN is particularly broad and should almost never be granted to normal app workloads.
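A reviewer can scan for these fields mechanically. The following is a minimal sketch, assuming the Pod spec is already parsed into a dict. The function name find_dangerous_fields and the capability shortlist are illustrative assumptions, not an exhaustive policy.

```python
from typing import Any

HOST_NAMESPACE_FIELDS = ("hostNetwork", "hostPID", "hostIPC")
# Assumption: an illustrative shortlist, not a complete list of risky capabilities.
HIGH_RISK_CAPABILITIES = {"SYS_ADMIN", "NET_ADMIN"}


def find_dangerous_fields(pod_spec: dict[str, Any]) -> list[str]:
    """Return one finding per host-level power the Pod spec requests."""
    findings = []
    for field in HOST_NAMESPACE_FIELDS:
        if pod_spec.get(field) is True:
            findings.append(f"{field}: true shares a host namespace")
    for volume in pod_spec.get("volumes") or []:
        host_path = volume.get("hostPath")
        if host_path is not None:
            findings.append(f"volume {volume.get('name')!r} mounts hostPath {host_path.get('path')}")
    # Init containers run with the same kind of privileges, so include them.
    containers = (pod_spec.get("initContainers") or []) + (pod_spec.get("containers") or [])
    for container in containers:
        name = container.get("name")
        sc = container.get("securityContext") or {}
        if sc.get("privileged") is True:
            findings.append(f"container {name!r} is privileged")
        added = set((sc.get("capabilities") or {}).get("add") or [])
        for cap in sorted(added & HIGH_RISK_CAPABILITIES):
            findings.append(f"container {name!r} adds capability {cap}")
    return findings


node_agent = {
    "hostNetwork": True,
    "hostPID": True,
    "volumes": [{"name": "host", "hostPath": {"path": "/var/run/docker.sock"}}],
    "containers": [
        {
            "name": "agent",
            "securityContext": {"privileged": True, "capabilities": {"add": ["SYS_ADMIN"]}},
        }
    ],
}

for finding in find_dangerous_fields(node_agent):
    print(finding)
```

A non-empty result is not automatically a rejection. It is the list of powers that need a named owner and a reason.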

5. Security context fields that matter most

5.1 `runAsNonRoot`

Use it for almost every workload:

securityContext:
  runAsNonRoot: true

It tells the kubelet to refuse to start the container if it would run as UID 0.

Caveat: if the image does not declare a numeric non-root user and you do not set runAsUser, startup can fail. That failure is good. It reveals an image contract problem.
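The caveat is easier to reason about as a decision table. The sketch below is a simplified model of the outcome for a container with runAsNonRoot: true, not the kubelet's actual code. The function name and the reason strings are hypothetical.

```python
from typing import Optional


def run_as_non_root_check(run_as_user: Optional[int], image_user: Optional[str]) -> tuple[bool, str]:
    """Return (starts, reason) for a container that sets runAsNonRoot: true.

    run_as_user: the effective runAsUser from the manifest, or None if unset.
    image_user:  the USER declared by the image ("10001", "app", ...), or None.
    """
    if run_as_user is not None:
        # An explicit runAsUser takes precedence over the image's USER.
        if run_as_user == 0:
            return False, "runAsUser is 0"
        return True, f"runs as UID {run_as_user}"
    if not image_user:
        return False, "image declares no user, so it would run as root"
    uid_part = image_user.split(":", 1)[0]  # USER may be written as "uid:gid"
    if not uid_part.isdigit():
        return False, "image user is a name, so non-root cannot be verified"
    if int(uid_part) == 0:
        return False, "image user is UID 0"
    return True, f"runs as image UID {uid_part}"


for manifest_uid, image_user in [(10001, None), (None, "10001"), (None, "app"), (None, None)]:
    print(manifest_uid, image_user, run_as_non_root_check(manifest_uid, image_user))
```

The third case is the one that surprises teams: the image does run as a non-root user, but because USER is a name rather than a number, the check cannot prove it.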

5.2 `runAsUser` and `runAsGroup`

Prefer explicit numeric IDs:

securityContext:
  runAsUser: 10001
  runAsGroup: 10001

Avoid relying on a username in /etc/passwd unless your image is controlled and tested. Numeric IDs make the runtime contract clear.

5.3 `fsGroup`

Use when mounted volumes need group ownership:

securityContext:
  fsGroup: 10001

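Be careful with large volumes. By default, Kubernetes recursively changes ownership and permissions of the volume contents each time the volume is mounted, which can be slow. Setting fsGroupChangePolicy: OnRootMismatch skips that walk when the volume root already has the expected ownership, for volume types that support it. On high-volume stateful systems, volume permission strategy must be tested during cold start and failover.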

5.4 `allowPrivilegeEscalation`

Use this by default:

securityContext:
  allowPrivilegeEscalation: false

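It is a simple, high-value control: it sets the no_new_privs flag on the container process, so setuid binaries and file capabilities cannot grant more privilege than the process started with.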

5.5 Linux capabilities

Default posture:

securityContext:
  capabilities:
    drop:
      - ALL

If you need one capability, add only that one with a documented reason:

securityContext:
  capabilities:
    drop:
      - ALL
    add:
      - NET_BIND_SERVICE

Even NET_BIND_SERVICE is often unnecessary if the app listens on an unprivileged port such as 8080 and the Service maps port 80 externally.
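The "drop everything, add back by exception" posture is straightforward to check. This is a minimal sketch over a parsed container dict. The allowlist constant and the function name are assumptions chosen for illustration.

```python
from typing import Any

# Assumption: the allowlist mirrors this lesson's advice; extend it only
# through a documented exception.
ALLOWED_ADDED_CAPABILITIES = {"NET_BIND_SERVICE"}


def capability_violations(container: dict[str, Any]) -> list[str]:
    """Check one container against the drop-ALL, add-by-allowlist posture."""
    caps = (container.get("securityContext") or {}).get("capabilities") or {}
    dropped = set(caps.get("drop") or [])
    added = set(caps.get("add") or [])
    violations = []
    if "ALL" not in dropped:
        violations.append("capabilities.drop must include ALL")
    for cap in sorted(added - ALLOWED_ADDED_CAPABILITIES):
        violations.append(f"capability {cap} is added but not on the allowlist")
    return violations


web = {"name": "web", "securityContext": {"capabilities": {"drop": ["ALL"], "add": ["NET_BIND_SERVICE"]}}}
agent = {"name": "agent", "securityContext": {"capabilities": {"add": ["NET_ADMIN", "SYS_ADMIN"]}}}

print(capability_violations(web))
print(capability_violations(agent))
```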

5.6 `seccompProfile`

Default posture:

securityContext:
  seccompProfile:
    type: RuntimeDefault

Seccomp filters system calls. Runtime defaults are not application-specific perfect profiles, but they are a strong baseline compared with unconfined execution.

5.7 `readOnlyRootFilesystem`

Default posture for stateless APIs:

securityContext:
  readOnlyRootFilesystem: true

This forces application write paths to be explicit.

Typical required writable paths:

/tmp
application cache directory
runtime socket directory
file upload staging directory

Declare them deliberately:

volumeMounts:
  - name: tmp
    mountPath: /tmp
volumes:
  - name: tmp
    emptyDir: {}
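Before turning on a read-only root filesystem, it helps to compare the paths the application writes to against the declared mounts. The sketch below assumes you have collected the write paths yourself, for example from a staging run. The helper name uncovered_write_paths is hypothetical.

```python
from pathlib import PurePosixPath
from typing import Any


def uncovered_write_paths(container: dict[str, Any], write_paths: list[str]) -> list[str]:
    """Return the write paths that no writable volumeMount covers."""
    mounts = [
        PurePosixPath(mount["mountPath"])
        for mount in container.get("volumeMounts") or []
        if not mount.get("readOnly", False)  # read-only mounts do not help
    ]
    uncovered = []
    for raw in write_paths:
        path = PurePosixPath(raw)
        # Covered if the path is a mount point or lives below one.
        if not any(path == mount or mount in path.parents for mount in mounts):
            uncovered.append(raw)
    return uncovered


container = {
    "name": "orders-api",
    "volumeMounts": [
        {"name": "tmp", "mountPath": "/tmp"},
        {"name": "cache", "mountPath": "/app/cache"},
    ],
}
observed_writes = ["/tmp/upload-1", "/app/cache/index", "/var/log/app.log"]

print(uncovered_write_paths(container, observed_writes))
```

Any path in the result will fail with a read-only filesystem error at runtime, so it needs either a mount or a change in the application, such as logging to stdout.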

6. The real contract: safe workload class by class

Different workload classes need different baselines.

Workload	Expected security posture	Common exception
Stateless API	restricted, non-root, no service account token, no host access	writable `/tmp`, app cache
Worker/consumer	same as stateless API	larger ephemeral storage, cloud identity
Batch Job	restricted, explicit service account if needed	write scratch space
Ingress controller	restricted where possible, but may need elevated networking	host ports depending on implementation
CNI plugin	privileged or host access often required	exception namespace only
CSI node plugin	privileged/host mounts often required	exception namespace only
Observability agent	host access often required	DaemonSet exception with tight RBAC
Security agent	privileged sometimes required	vendor-specific review

The principle:

Application namespaces should be restricted. Platform/system namespaces may have exceptions, but exceptions must be bounded, named, reviewed, and monitored.

7. Pod Security Standards: Privileged, Baseline, Restricted

Kubernetes defines Pod Security Standards as three broad policy levels:

Privileged — unrestricted; intended for trusted system workloads.
Baseline — prevents known privilege escalations while allowing common workloads.
Restricted — heavily restricted; follows hardening best practices.

Use them as an operating model, not just documentation.

A practical environment mapping:

Namespace type	Recommended PSS level
App dev sandbox	`baseline` with `warn/audit: restricted`
App staging	`restricted` enforce, limited exceptions
App production	`restricted` enforce
Platform controllers	`baseline` or privileged exception, case by case
CNI/CSI/system agents	privileged exception namespace
Security tooling	privileged exception only when justified

8. Pod Security Admission

Pod Security Admission is Kubernetes' built-in admission controller for enforcing Pod Security Standards at namespace level.

It works using namespace labels.

Example: enforce restricted in a production namespace:

apiVersion: v1
kind: Namespace
metadata:
  name: app-orders
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: latest

Three modes matter:

Mode	Effect
`enforce`	reject non-compliant Pod creation/update
`warn`	allow request but return warning to client
`audit`	allow request but record audit annotation
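The three modes can be modeled as independent comparisons against the namespace labels. This is a deliberately simplified sketch, not the admission controller's implementation. It assumes an unset mode behaves like privileged and that warnings are reported only for requests that are admitted. The function name psa_outcome and the pod_level parameter are hypothetical.

```python
LEVELS = ["privileged", "baseline", "restricted"]  # least to most restrictive
LABEL_PREFIX = "pod-security.kubernetes.io/"


def psa_outcome(namespace_labels: dict[str, str], pod_level: str) -> dict[str, bool]:
    """Model the effect of the namespace labels on one Pod.

    pod_level is the most restrictive level the Pod spec satisfies.
    """

    def violates(mode: str) -> bool:
        # Assumption: a mode without a label checks nothing (like "privileged").
        required = namespace_labels.get(LABEL_PREFIX + mode, "privileged")
        return LEVELS.index(pod_level) < LEVELS.index(required)

    admitted = not violates("enforce")
    return {
        "admitted": admitted,
        # Assumption: no warning is attached to a request that is rejected anyway.
        "warning": admitted and violates("warn"),
        "audit_annotation": violates("audit"),
    }


observe_only = {LABEL_PREFIX + "warn": "restricted", LABEL_PREFIX + "audit": "restricted"}
enforcing = {**observe_only, LABEL_PREFIX + "enforce": "restricted"}

print(psa_outcome(observe_only, "baseline"))
print(psa_outcome(enforcing, "baseline"))
```

The first call shows why warn and audit are safe to enable on an existing namespace: the Pod is still admitted, and the violation only becomes visible.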

8.1 Rollout strategy

Do not turn on enforce: restricted blindly across an existing cluster.

Use this progression:

Start with:

metadata:
  labels:
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted

Then move to:

metadata:
  labels:
    pod-security.kubernetes.io/enforce: restricted

8.2 Version pinning

latest follows the current Kubernetes version's definition of the standard. That is convenient but can change behavior during upgrades.

For strict production governance, consider pinning:

pod-security.kubernetes.io/enforce-version: v1.34

Then plan standard upgrades deliberately.

The trade-off:

Version strategy	Benefit	Risk
`latest`	automatically tracks latest standard	upgrade may introduce new violations
pinned version	predictable enforcement	may lag behind current hardening
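To see which strategy each namespace actually follows, the version labels can be classified per mode. This is a small sketch under two assumptions: an omitted version label is treated as latest, and valid pinned versions look like v1.NN. The helper name version_report is hypothetical.

```python
import re

LABEL_PREFIX = "pod-security.kubernetes.io/"
VERSION_PATTERN = re.compile(r"^(latest|v1\.\d+)$")


def version_report(namespace_labels: dict[str, str]) -> dict[str, str]:
    """Classify each configured mode as 'pinned', 'latest', or 'invalid'."""
    report = {}
    for mode in ("enforce", "warn", "audit"):
        if LABEL_PREFIX + mode not in namespace_labels:
            continue  # this mode is not configured on the namespace
        # Assumption: a missing version label means the mode tracks "latest".
        version = namespace_labels.get(f"{LABEL_PREFIX}{mode}-version", "latest")
        if not VERSION_PATTERN.match(version):
            report[mode] = "invalid"
        elif version == "latest":
            report[mode] = "latest"
        else:
            report[mode] = "pinned"
    return report


labels = {
    LABEL_PREFIX + "enforce": "restricted",
    LABEL_PREFIX + "enforce-version": "v1.34",
    LABEL_PREFIX + "warn": "restricted",
    LABEL_PREFIX + "audit": "restricted",
    LABEL_PREFIX + "audit-version": "1.34",  # missing the "v" prefix
}

print(version_report(labels))
```

One pattern this makes visible is pinning enforce while leaving warn and audit on latest, so that upcoming violations surface before the pinned version is raised.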

9. Admission control mental model

Admission is the final gate before persistence.

The order matters:

Mutating admission can change the object.
Validating admission sees the object after mutation.
Pod Security Admission only validates. It never mutates a Pod to make it compliant.
Custom admission webhooks can validate or mutate many resource types.

This creates a powerful but dangerous platform surface.

A broken webhook can block deployments. A careless mutating webhook can hide bad application contracts. A weak validating policy can create false confidence.
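The ordering, and the way a mutating step can mask a weak manifest, can be shown with a toy pipeline. This is an illustrative sketch only. The names admit, default_seccomp, and require_seccomp are hypothetical and do not correspond to any real webhook.

```python
import copy
from typing import Any, Callable

Pod = dict[str, Any]
Mutator = Callable[[Pod], Pod]
Validator = Callable[[Pod], list[str]]


def admit(pod: Pod, mutators: list[Mutator], validators: list[Validator]) -> tuple[bool, Pod, list[str]]:
    """Run mutators first, then run every validator against the mutated object."""
    candidate = copy.deepcopy(pod)  # never modify the submitted manifest
    for mutate in mutators:
        candidate = mutate(candidate)
    errors = [error for validate in validators for error in validate(candidate)]
    return (not errors, candidate, errors)


def default_seccomp(pod: Pod) -> Pod:
    # Hypothetical mutating step: fill in a seccomp profile when none is set.
    sc = pod.setdefault("spec", {}).setdefault("securityContext", {})
    sc.setdefault("seccompProfile", {"type": "RuntimeDefault"})
    return pod


def require_seccomp(pod: Pod) -> list[str]:
    # Hypothetical validating step: reject Pods without a seccomp profile.
    sc = pod.get("spec", {}).get("securityContext", {})
    return [] if "seccompProfile" in sc else ["seccompProfile is required"]


bare_pod = {"spec": {"containers": [{"name": "app"}]}}

rejected = admit(bare_pod, [], [require_seccomp])
accepted = admit(bare_pod, [default_seccomp], [require_seccomp])
print(rejected[0], rejected[2])
print(accepted[0], accepted[1]["spec"]["securityContext"])
```

The second call passes only because the mutator repaired the object. The manifest in source control is still missing the field, which is exactly how a mutating webhook can hide a bad application contract.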

10. What Pod Security Admission does not solve

Pod Security Admission is a baseline, not a full policy platform.

It does not fully answer questions like:

Is the image from an approved registry?
Is the image digest pinned?
Does every Deployment define resource requests?
Are only approved Ingress classes used?
Does every namespace have NetworkPolicy?
Are cloud IAM roles allowed only from specific namespaces?
Are labels and ownership metadata complete?
Is this Secret allowed to be mounted by this workload?
Does this workload violate cost policy?

That is why Part 024 covers policy as code.

The model:

Layer	Use for
Security context	runtime process boundary
Pod Security Admission	built-in Pod hardening baseline
ValidatingAdmissionPolicy	native CEL-based validation
Kyverno/Gatekeeper	richer policy as code
Cloud policy	organizational compliance and cloud governance

11. Namespace model for production

A practical production cluster should classify namespaces.

apiVersion: v1
kind: Namespace
metadata:
  name: app-payments-prod
  labels:
    platform.example.com/tier: production
    platform.example.com/owner: payments
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: latest

Platform namespace with exception:

apiVersion: v1
kind: Namespace
metadata:
  name: platform-observability
  labels:
    platform.example.com/tier: platform
    platform.example.com/owner: sre
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted

System namespace:

apiVersion: v1
kind: Namespace
metadata:
  name: kube-system
  labels:
    platform.example.com/tier: system
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/warn: baseline
    pod-security.kubernetes.io/audit: baseline

This is not permission to be careless in kube-system. It is recognition that some system agents need host-level powers.
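The classification can be kept as data so that every namespace gets its labels from one place. The sketch below simply encodes the three examples above. The tier names and label keys come from those examples, while the function name namespace_labels is a hypothetical helper.

```python
PSA_PREFIX = "pod-security.kubernetes.io/"

# Tier -> (enforce, warn, audit) levels, taken from the three namespace examples.
TIER_LEVELS = {
    "production": ("restricted", "restricted", "restricted"),
    "platform": ("baseline", "restricted", "restricted"),
    "system": ("privileged", "baseline", "baseline"),
}


def namespace_labels(tier: str, owner: str) -> dict[str, str]:
    """Build the label set for a namespace of the given tier."""
    if tier not in TIER_LEVELS:
        # Refuse to guess: an unclassified namespace is a review item.
        raise ValueError(f"unclassified namespace tier: {tier!r}")
    enforce, warn, audit = TIER_LEVELS[tier]
    return {
        "platform.example.com/tier": tier,
        "platform.example.com/owner": owner,
        PSA_PREFIX + "enforce": enforce,
        PSA_PREFIX + "warn": warn,
        PSA_PREFIX + "audit": audit,
    }


for key, value in namespace_labels("platform", "sre").items():
    print(f"{key}: {value}")
```

Failing on an unknown tier is a deliberate choice here: a namespace that nobody classified should not silently receive the most permissive labels.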

12. EKS notes

On EKS, the same Kubernetes-level concepts apply, but there are cloud-specific interactions.

12.1 Privileged DaemonSets are common in platform namespaces

Examples often include:

CNI plugin
CSI node drivers
observability agents
security agents
node-local DNS agents

Do not copy those privileges into application namespaces.

12.2 Service account token exposure matters

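EKS workloads often integrate with IAM through EKS Pod Identity or IRSA, so a service account token can become the bridge to cloud APIs. That is why automountServiceAccountToken: false is a powerful default for workloads that need neither the Kubernetes API nor a cloud identity. Note that this field controls only the default Kubernetes API token. The tokens used by IRSA and EKS Pod Identity are projected into the Pod separately, so a workload that should have no cloud access should also not run under an IAM-mapped ServiceAccount.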

12.3 Node role blast radius is still relevant

Pod security reduces in-Pod privilege. It does not replace least-privilege node IAM, workload identity, network policy, or IMDS hardening.

12.4 Admission policy is part of platform ownership

EKS does not remove the need for in-cluster policy. In real platforms, teams commonly use Pod Security Admission plus Kyverno or Gatekeeper for custom guardrails.

13. AKS notes

On AKS, the same workload hardening rules apply, with Azure-specific integrations.

13.1 Workload identity and pod security reinforce each other

A Pod with Azure Workload Identity can access Azure resources. The container should still run non-root, without privilege escalation, and with minimal filesystem write surface.

13.2 Azure Policy can enforce Kubernetes guardrails

AKS environments often use Azure Policy for governance at subscription/resource-group/cluster level. Treat it as organizational compliance control, not as a substitute for good workload manifests.

13.3 System and add-on namespaces need exception modeling

AKS-managed add-ons and node-level agents may require elevated privileges. Application namespace policy should be stricter than platform namespace policy.

14. Review workflow for a Deployment

When reviewing a workload, use this order.

Step 1 — Does it need Kubernetes API credentials?

If no:

automountServiceAccountToken: false

If yes:

bind a dedicated ServiceAccount
grant minimal RBAC
grant minimal cloud permissions
avoid using default ServiceAccount

Step 2 — Does it run as root?

Look for:

runAsNonRoot: true
runAsUser: 10001

If it must run as root, require a written exception and compensating controls.

Step 3 — Can it escalate privileges?

Look for:

allowPrivilegeEscalation: false

Step 4 — What capabilities does it have?

Look for:

capabilities:
  drop: ["ALL"]

Question every added capability.

Step 5 — Is the root filesystem writable?

Prefer:

readOnlyRootFilesystem: true

Declare writable paths explicitly.

Step 6 — Does it use host access?

Review:

hostNetwork: true
hostPID: true
hostIPC: true
hostPath: ...

App workloads should almost never need these.

Step 7 — Does namespace policy enforce the same expectation?

Check namespace labels:

kubectl get ns app-orders --show-labels
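The seven steps can be sketched as one ordered check over a parsed Pod spec and its namespace labels. This is a simplified illustration, not a replacement for admission policy: review_pod_spec is a hypothetical helper, and step 1 looks only at the Pod spec, although the same setting can also come from the ServiceAccount.

```python
from typing import Any

HOST_FIELDS = ("hostNetwork", "hostPID", "hostIPC")
ENFORCE_LABEL = "pod-security.kubernetes.io/enforce"


def review_pod_spec(pod_spec: dict[str, Any], namespace_labels: dict[str, str]) -> list[str]:
    """Return findings, each tagged with the review step that produced it."""
    findings: list[str] = []
    pod_sc = pod_spec.get("securityContext") or {}

    # Step 1: API credentials (simplified: ignores the ServiceAccount's own setting).
    if pod_spec.get("automountServiceAccountToken") is not False:
        findings.append("step 1: service account token is auto-mounted")

    for container in pod_spec.get("containers") or []:
        name = container.get("name", "<unnamed>")
        sc = container.get("securityContext") or {}
        caps = sc.get("capabilities") or {}
        # Step 2: a container-level runAsNonRoot overrides the Pod-level value.
        if sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot")) is not True:
            findings.append(f"step 2: {name} is not forced to run as non-root")
        # Step 3: privilege escalation must be switched off explicitly.
        if sc.get("allowPrivilegeEscalation") is not False:
            findings.append(f"step 3: {name} can escalate privileges")
        # Step 4: drop everything, then question each addition.
        if "ALL" not in (caps.get("drop") or []):
            findings.append(f"step 4: {name} does not drop ALL capabilities")
        for cap in caps.get("add") or []:
            findings.append(f"step 4: {name} adds {cap}; ask why")
        # Step 5: root filesystem should be read-only.
        if sc.get("readOnlyRootFilesystem") is not True:
            findings.append(f"step 5: {name} has a writable root filesystem")

    # Step 6: host namespaces and host paths.
    for field in HOST_FIELDS:
        if pod_spec.get(field) is True:
            findings.append(f"step 6: {field} is enabled")
    for volume in pod_spec.get("volumes") or []:
        if "hostPath" in volume:
            findings.append(f"step 6: volume {volume.get('name')} uses hostPath")

    # Step 7: the namespace should enforce what the manifest promises.
    if namespace_labels.get(ENFORCE_LABEL) != "restricted":
        findings.append("step 7: namespace does not enforce restricted")
    return findings


hardened = {
    "automountServiceAccountToken": False,
    "securityContext": {"runAsNonRoot": True, "runAsUser": 10001},
    "containers": [
        {
            "name": "orders-api",
            "securityContext": {
                "allowPrivilegeEscalation": False,
                "readOnlyRootFilesystem": True,
                "capabilities": {"drop": ["ALL"]},
            },
        }
    ],
    "volumes": [{"name": "tmp", "emptyDir": {}}],
}
unsafe = {"containers": [{"name": "api", "image": "nginx:latest"}]}

print(review_pod_spec(hardened, {ENFORCE_LABEL: "restricted"}))
for finding in review_pod_spec(unsafe, {}):
    print(finding)
```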

15. Debugging Pod Security failures

Symptom: Deployment created but Pods do not start

Check ReplicaSet events:

kubectl describe rs -n app-orders

Check namespace labels:

kubectl get ns app-orders --show-labels

Check events:

kubectl get events -n app-orders --sort-by=.lastTimestamp

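You may see that Pod creation is forbidden by Pod Security Admission. The Deployment itself is accepted because enforce is evaluated on Pods, not on the controllers that create them, so the rejection surfaces as a FailedCreate event on the ReplicaSet.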

Symptom: app crashes after enabling read-only root filesystem

Look for write attempts:

/tmp
logs written to files instead of stdout
local cache directories
framework-generated files
JVM temp directory
Python/Node package cache

Fix by mounting explicit writable paths:

volumes:
  - name: tmp
    emptyDir: {}
containers:
  - name: app
    volumeMounts:
      - name: tmp
        mountPath: /tmp

Symptom: app cannot bind to port 80

Better fix: run app on 8080 and expose Service port 80.

ports:
  - name: http
    containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: orders-api
spec:
  ports:
    - name: http
      port: 80
      targetPort: 8080

Avoid adding NET_BIND_SERVICE unless there is a real reason.

Symptom: volume permission denied

Check UID/GID:

kubectl exec -n app-orders deploy/orders-api -- id

Check mount ownership:

kubectl exec -n app-orders deploy/orders-api -- ls -lah /data

Potential fixes:

set fsGroup
use init container to prepare permissions, if allowed
fix image user/group
use storage class/CSI options where available

Be careful: init containers running as root can bypass your runtime baseline if not controlled.

16. Exception design

Exceptions are inevitable. Hidden exceptions are dangerous.

A good exception has:

owner
reason
scope
expiry
compensating controls
review date
specific namespace/workload match
audit visibility

Example annotation:

metadata:
  annotations:
    security.platform.example.com/exception-id: SEC-2026-0412
    security.platform.example.com/exception-expiry: "2026-10-01"
    security.platform.example.com/exception-reason: "Vendor node security agent requires hostPID and privileged mode"

Do not allow permanent anonymous exceptions.
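Exception metadata is only useful if something checks it. The sketch below validates the three annotations from the example and flags expired entries. It assumes the expiry is an ISO date (YYYY-MM-DD) as in the example, and exception_problems is a hypothetical helper name.

```python
from datetime import date

PREFIX = "security.platform.example.com/"
REQUIRED_KEYS = ("exception-id", "exception-expiry", "exception-reason")


def exception_problems(annotations: dict[str, str], today: date) -> list[str]:
    """Return the reasons this exception record should not be trusted."""
    problems = [
        f"missing annotation {PREFIX}{key}"
        for key in REQUIRED_KEYS
        if not annotations.get(PREFIX + key)
    ]
    expiry_raw = annotations.get(PREFIX + "exception-expiry")
    if expiry_raw:
        try:
            expiry = date.fromisoformat(expiry_raw)
        except ValueError:
            problems.append("exception-expiry is not an ISO date (YYYY-MM-DD)")
        else:
            if expiry < today:
                problems.append(f"exception expired on {expiry.isoformat()}")
    return problems


annotations = {
    PREFIX + "exception-id": "SEC-2026-0412",
    PREFIX + "exception-expiry": "2026-10-01",
    PREFIX + "exception-reason": "Vendor node security agent requires hostPID and privileged mode",
}

print(exception_problems(annotations, today=date(2026, 7, 3)))
print(exception_problems(annotations, today=date(2026, 10, 2)))
```

Passing today as a parameter keeps the check deterministic, which makes it easy to run in CI against every manifest that carries an exception.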

17. Failure modes

17.1 Enforcing restricted on system namespaces

Impact: CNI/CSI/observability agents fail to start.

Prevention:

classify namespaces
test admission labels in staging
avoid blanket namespace labeling

17.2 Read-only root filesystem breaks runtime behavior

Impact: app starts crashing after security hardening.

Prevention:

run app with read-only root filesystem in CI/staging
declare writable paths explicitly
inspect framework temp/cache paths

17.3 Non-root user cannot access mounted volume

Impact: app fails on startup or cannot persist data.

Prevention:

test volume ownership
use fsGroup deliberately
validate stateful failover path

17.4 PSA blocks Pods created by controllers

Impact: Deployment exists, but ReplicaSet cannot create Pods.

Prevention:

validate rendered manifests before apply
inspect events, not just Deployment status

17.5 Security context exists but is meaningless

Example:

securityContext:
  runAsNonRoot: true

But the Pod also mounts the container runtime socket from the host:

hostPath:
  path: /var/run/containerd/containerd.sock

Prevention:

review full Pod spec
enforce host access policy
use policy-as-code for high-risk fields

17.6 Default service account token exposed everywhere

Impact: a compromised app can call the Kubernetes API with the permissions of the namespace's default ServiceAccount.

Prevention:

set automountServiceAccountToken: false by default
create dedicated ServiceAccounts only when needed
bind minimal RBAC

18. Production checklist

A production app namespace should satisfy:

19. Practice lab

Lab 1 — Harden a weak Deployment

Start with this unsafe workload:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: unsafe-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: unsafe-api
  template:
    metadata:
      labels:
        app: unsafe-api
    spec:
      containers:
        - name: api
          image: nginx:latest
          ports:
            - containerPort: 80

Tasks:

Replace mutable image tag with a pinned version or digest.
Run as non-root.
Disable privilege escalation.
Drop all capabilities.
Use RuntimeDefault seccomp.
Make root filesystem read-only.
Add writable /tmp if needed.
Disable service account token automount.
Put it in a restricted namespace.

Lab 2 — Enable Pod Security Admission progressively

Create namespace:

apiVersion: v1
kind: Namespace
metadata:
  name: security-lab
  labels:
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted

Apply an unsafe Pod and observe the warnings and events. Then switch to:

pod-security.kubernetes.io/enforce: restricted

Try again. Explain exactly why it fails.

Lab 3 — Build an exception namespace

Create a namespace for a fake node agent that needs elevated permissions.

Requirements:

allow privileged workload only in that namespace
label owner
add exception annotations
document why app namespaces must not copy the pattern

20. Key takeaways

securityContext is the runtime privilege contract for container processes.
Pod Security Standards provide a shared vocabulary: privileged, baseline, restricted.
Pod Security Admission enforces those standards at namespace level.
App namespaces should be restricted by default.
Platform namespaces may need exceptions, but exceptions must be explicit and audited.
Admission control prevents unsafe specs from becoming cluster state.
Runtime hardening does not replace RBAC, workload identity, NetworkPolicy, image security, or cloud IAM. It complements them.

