
GitOps Delivery Model: Declarative Operations at Scale

Learn Kubernetes, Deployment Model, and Cloud Native Platform Engineering - Part 030

GitOps delivery model for Kubernetes, including declarative desired state, source of truth, pull-based reconciliation, drift detection, promotion strategy, environment topology, secret handling, policy integration, rollback, multi-cluster delivery, and enterprise operating model.

2026-07-01 | 22 min read | 4346 words



1. Why This Part Exists

Kubernetes is declarative.

But many teams still operate it imperatively:

kubectl apply -f production.yaml
helm upgrade --install app ./chart
kubectl edit deployment checkout-api

That works for learning and emergencies.

It does not scale well for regulated, multi-team, multi-cluster, auditable production platforms.

The GitOps mental model is:

Git stores the desired state. A controller continuously reconciles the cluster toward that desired state. Humans and pipelines change Git, not the cluster directly.

GitOps matters because Kubernetes already runs on reconciliation. GitOps extends that reconciliation boundary outward:

Git desired state -> GitOps controller -> Kubernetes API -> Kubernetes controllers -> actual workload state

This part teaches GitOps as an operating model, not as a tool tutorial.

You should finish this part able to design:

repository structure
application definitions
promotion strategy
environment separation
secret handling
policy gates
drift management
rollback model
multi-cluster topology
emergency procedure
platform/team ownership boundary

2. Kaufman Skill Target

The skill target for this part is:

Given several Kubernetes services, environments, and clusters, design a GitOps delivery model that makes desired state auditable, promotion explicit, drift detectable, rollback practical, and platform guardrails enforceable without turning the platform team into a deployment bottleneck.

This requires seven sub-skills:

Declarative state design — define what belongs in Git and at what abstraction level.
Repository topology — organize manifests, overlays, apps, clusters, and ownership boundaries.
Reconciliation reasoning — understand sync, drift, health, pruning, and failure modes.
Promotion modelling — move changes across environments deliberately.
Security and policy integration — protect secrets, RBAC, admission, and approvals.
Rollback and incident handling — recover using Git history and controller behavior.
Scale governance — support many teams and clusters without chaos.

3. GitOps Is Not Just “Put YAML in Git”

Weak GitOps:

We keep Kubernetes YAML in a repository.

Strong GitOps:

The repository is the authoritative desired state. Changes are reviewed, versioned, immutable after merge, automatically pulled by an agent, continuously reconciled, observable, and governed.

OpenGitOps defines four core principles:

Principle	Practical Meaning
Declarative	Desired state is expressed declaratively.
Versioned and immutable	Desired state is stored with version history and immutability.
Pulled automatically	Agents pull desired state from the source.
Continuously reconciled	Agents compare actual state to desired state and attempt convergence.

In Kubernetes terms:

GitOps = external desired-state reconciliation around Kubernetes desired-state reconciliation.

4. GitOps Control Loop

The important boundary:

CI builds artifacts. GitOps deploys desired state.

A clean model:

CI builds image.
CI tests image.
CI signs the image and attaches provenance.
CI updates desired state repository or opens a PR.
GitOps controller pulls approved desired state.
Kubernetes reconciles workload.
Observability validates runtime behavior.

Avoid giving CI broad cluster-admin access just to deploy.

5. Push-Based CD vs Pull-Based GitOps

Dimension	Push-Based CD	Pull-Based GitOps
Actor	CI/CD system pushes to cluster.	In-cluster or platform agent pulls from Git.
Cluster credentials	Often stored in CI.	Mostly held by GitOps controller.
Drift detection	Usually separate/manual.	Core controller capability.
Audit trail	Pipeline logs + Git.	Git history + controller events.
Failure mode	Pipeline may partially apply and exit.	Controller keeps reconciling or reports out-of-sync.
Multi-cluster	CI needs access to many clusters.	Each cluster can pull its assigned state.
Security posture	Larger external credential surface.	Smaller externally exposed credential surface, but the controller becomes a critical component.

Push-based deployment is not automatically bad.

But at scale, pull-based reconciliation usually gives better:

auditability
blast-radius control
drift visibility
environment consistency
credential isolation
cluster autonomy

6. Tooling Landscape

Common Kubernetes GitOps tools:

Tool	Model	Notes
Argo CD	Application-centric controller and UI.	Strong visualization, sync/health, app projects, multi-cluster, broad ecosystem.
Flux CD	Toolkit of controllers.	Strong Kubernetes-native composability, source/kustomize/helm/image automation controllers.
Fleet / Rancher GitOps	Fleet management model.	Useful in Rancher-centered environments.
OpenShift GitOps	Argo CD-based distribution.	Integrated with OpenShift ecosystem.

This series does not prescribe one tool.

The invariant is:

The tool must implement a safe reconciliation model aligned with your organization boundaries.

7. Argo CD Mental Model

Argo CD is implemented as a Kubernetes controller. It watches desired application state from Git and compares it with live cluster state.

Core concepts:

Concept	Meaning
Application	A desired-state unit mapped from source repo/path/chart to destination cluster/namespace.
AppProject	Boundary for allowed repos, destinations, namespaces, and resources.
Sync	Apply desired state to target cluster.
Health	Resource-level interpretation of runtime health.
OutOfSync	Live state differs from desired Git state.
Prune	Delete live resources no longer in desired state.
Sync waves/hooks	Order complex rollouts or supporting resources.
Auto-sync	Controller applies changes automatically.

Argo CD is good when you need:

UI for app/platform visibility
manual and automatic sync modes
application health view
RBAC and multi-tenancy through projects
multi-cluster app management
clear operational workflow

8. Flux Mental Model

Flux is a set of controllers that reconcile sources and Kubernetes resources.

Common concepts:

Concept	Meaning
GitRepository	Source artifact from Git.
HelmRepository	Source artifact from Helm repo.
OCIRepository	Source artifact from OCI registry.
Kustomization	Reconcile a path/artifact into cluster.
HelmRelease	Reconcile Helm chart release.
Image automation	Detect image updates and update Git.

Flux is strong when you want:

Kubernetes-native composition
controller-per-responsibility architecture
Git/OCI/Helm source automation
less UI-centered operation
fine-grained reconciliation primitives
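
As a rough illustration of how these concepts compose, the sketch below pairs one GitRepository source with one Kustomization that reconciles a path from it. The repository URL and path are borrowed from this lesson's other examples; the `flux-system` namespace, the intervals, and the timeout are assumptions, and the API versions depend on the Flux release installed.

```yaml
# Sketch only: names, intervals, and namespaces are illustrative assumptions.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: platform-live
  namespace: flux-system
spec:
  interval: 1m            # how often the source controller checks Git
  url: https://git.example.com/platform/platform-live.git
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: checkout-api-prod
  namespace: flux-system
spec:
  interval: 10m           # how often live state is re-checked against the source
  sourceRef:
    kind: GitRepository
    name: platform-live
  path: ./environments/prod/apps/checkout-api
  targetNamespace: checkout-prod
  prune: false            # keep pruning off until ownership boundaries are proven
  timeout: 2m
```

The split mirrors the controller-per-responsibility idea: one object describes where desired state comes from, the other describes what to apply from it and how often.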

9. What Belongs in Git?

A common mistake is putting either too little or too much in Git.

9.1 Good Candidates

Put these in Git:

namespaces
RBAC
NetworkPolicies
ResourceQuotas
LimitRanges
Deployments
StatefulSets
Services
Ingress/Gateway routes
ConfigMaps without secrets
ExternalSecret definitions
sealed/encrypted secret objects if using that model
Helm values
Kustomize overlays
policy definitions
monitoring rules
dashboards as code
Argo CD Applications or Flux Kustomizations
tenant/platform abstractions

9.2 Usually Not in Plain Git

Avoid plain-text Git storage for:

raw secret values
private keys
long-lived credentials
generated runtime status
high-churn generated objects
large binary artifacts
node-specific ephemeral data

9.3 Sometimes in Git, Carefully

Be careful with:

certificate material
database migration jobs
one-time break-glass resources
emergency patches
generated Helm chart lockfiles
environment-specific replica counts
feature flags
tenant-specific overrides

The guiding question:

Is this desired state that should be reviewed, versioned, reconciled, and recoverable?

If yes, Git is a good candidate.

10. Repository Topologies

There is no universal repo structure. There are trade-offs.

10.1 Mono-Repo for Platform State

platform-live/
  clusters/
    prod-eu-1/
    prod-us-1/
    staging-1/
  apps/
    checkout/
    payment/
  infrastructure/
    ingress/
    cert-manager/
    observability/
  policies/
  tenants/

Pros:

one global view
simple cross-cutting changes
centralized governance
easier dependency ordering

Cons:

repo permissions can become hard to manage
noisy PRs
merge contention
teams may feel blocked
blast radius of mistaken change can be large

Good for:

smaller platform teams
regulated environments
high governance needs
central platform ownership

10.2 App Repo Owns App Manifests

checkout-service/
  src/
  Dockerfile
  deploy/
    base/
    overlays/
      dev/
      staging/
      prod/

Pros:

app team owns deployment config
changes close to source code
easier service-specific review
developer autonomy

Cons:

platform standards can drift
cross-service consistency harder
app repos need Kubernetes maturity
environment promotion can get messy

Good for:

mature service teams
organizations that want to reduce the central bottleneck
teams backed by strong templates/policies

10.3 Separate App Source and Environment State

checkout-service/
  src/
  Dockerfile

platform-live/
  environments/
    dev/checkout.yaml
    staging/checkout.yaml
    prod/checkout.yaml

Pros:

application code and deployment state separated
production change control centralized
promotion is explicit
easier audit for environment state

Cons:

two-repo workflow
PR automation needed
developers may lose context

Good for:

enterprise production controls
regulated systems
platform teams providing paved roads

10.4 App-of-Apps / Root Application

root-app/
  clusters/prod/apps.yaml
  clusters/prod/platform.yaml

The root GitOps object points to child applications.

Pros:

bootstrap-friendly
hierarchical ownership
cluster composition visible

Cons:

dependency ordering must be explicit
large root apps can become fragile
accidental prune risk if not managed carefully
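
A root object of this kind could look roughly like the following Argo CD Application, which syncs a directory that itself contains child Application manifests. The repository URL, the `platform` project, and the `root-prod` name are hypothetical; only the `clusters/prod` path comes from the layout above.

```yaml
# Sketch only: a root Application whose source path holds child Application manifests.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root-prod                  # hypothetical name
  namespace: argocd
spec:
  project: platform                # hypothetical AppProject
  source:
    repoURL: https://git.example.com/platform/root-app.git   # hypothetical URL
    targetRevision: main
    path: clusters/prod            # contains apps.yaml and platform.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd              # child Application objects are created here
  syncPolicy:
    automated:
      prune: false                 # a removed child file will not delete the child app
      selfHeal: true
```

Leaving `prune: false` on the root is one way to reduce the accidental prune risk listed above, at the cost of having to delete retired child applications deliberately.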

11. Recommended Enterprise Shape

For complex organizations, a clean shape is often:

service-repo       -> source code, tests, Dockerfile, app-local defaults
artifact-registry  -> signed images, SBOMs, provenance
platform-catalog   -> reusable deployment templates/golden paths
platform-live      -> environment/cluster desired state
secrets-manager    -> actual secret values
policy-repo        -> admission and compliance policies

This separates concerns:

Concern	Owner
Source code	App team
Build and artifact	App team + platform controls
Deployment template	Platform team
Environment state	App/platform depending on governance
Secret value	Security/platform/application owner
Policy	Security/platform
Runtime operation	App + platform shared

12. Promotion Models

Promotion is how desired state moves from one environment to another.

12.1 Branch-Based Promotion

main -> staging branch -> prod branch

Pros:

easy to understand
branch protections can map to environment controls

Cons:

cherry-pick complexity
drift between branches
it can be unclear which commits have reached which environment

12.2 Directory-Based Promotion

environments/
  dev/
  staging/
  prod/

Promotion is a PR changing prod files to match staging.

Pros:

environment state visible side-by-side
simple audit
good for Kustomize/Helm values

Cons:

PR automation needed
duplicated values if poorly factored

12.3 Tag or Commit Pinning

image: registry.example.com/checkout@sha256:abc...

or:

sourceRevision: abc123

Pros:

immutable releases
strong auditability
rollback to known commit/digest

Cons:

more automation needed
humans dislike digest-heavy diffs

12.4 Environment Controller Promotion

A promotion controller updates environment state after checks pass.

Pros:

scalable
consistent
integrates SLO gates

Cons:

more platform engineering complexity
controller bugs become delivery bugs

12.5 Recommended Baseline

For production:

Promote immutable artifact references through PRs to environment-specific desired state.

Do not rebuild images per environment.

Build once. Promote the same artifact.
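
One way to express "promote the same artifact" is a Kustomize overlay per environment in which the only promoted value is the image digest. The sketch below assumes a hypothetical `base/` directory next to the overlays and uses a placeholder where the real digest would go; a promotion PR would change only the `digest` line in the prod overlay to the value already running in staging.

```yaml
# Sketch only: prod overlay kustomization.yaml; the ../../base path is an assumed layout.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
images:
  - name: registry.example.com/checkout-api
    # Placeholder: replace with the digest produced once by CI and verified in staging.
    digest: sha256:REPLACE_WITH_PROMOTED_DIGEST
```

Because the digest is the only changed line, the promotion diff stays small and the audit question "what exactly went to production?" has a one-line answer.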

13. Image Update Strategies

Weak pattern:

image: checkout-api:latest

Strong pattern:

image: registry.example.com/checkout-api@sha256:4f3c...

or at least:

image: registry.example.com/checkout-api:1.42.7

Digest pinning gives precise identity.

Trade-off:

tags are human-readable
digests are immutable and audit-friendly

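A common production compromise keeps the tag for readability and the digest for identity; when both are present, the image is pulled by digest: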

image: registry.example.com/checkout-api:1.42.7@sha256:4f3c...

13.1 Who Updates the Image?

Options:

Actor	Pattern
CI pipeline	Build image, open PR updating manifest.
Image automation controller	Detect new image, update Git.
Release manager	Manually promote specific image.
Deployment platform	Promote after policy/SLO gates.

For regulated systems, automatic merge to production is often too aggressive. Automatic PR creation plus approval gates is safer.

14. Sync Policy

GitOps controllers usually support manual or automated sync.

14.1 Manual Sync

Pros:

human control
easier during early adoption
good for high-risk apps

Cons:

drift can persist
humans become bottleneck
emergency changes may bypass Git

14.2 Auto Sync

Pros:

Git merge means deployment
drift repaired automatically
better consistency

Cons:

bad Git commit deploys quickly
pruning mistakes can be dangerous
requires strong pre-merge validation

14.3 Auto Sync with Guardrails

Recommended mature model:

auto-sync for lower environments
auto-sync for low-risk services
manual or gated sync for high-risk production services
automated validation before merge
progressive delivery for risky changes
emergency pause capability

14.4 Prune Policy

Prune means deleting live resources that no longer exist in Git.

Prune is powerful and dangerous.

Rules:

enable prune only when repo ownership is clean
use resource exclusions carefully
avoid broad apps that own too much
label resources clearly
test deletion in non-prod
protect namespaces and CRDs separately

The failure mode:

A bad commit removes a directory. GitOps prunes production resources.

Design against this.
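
As one possible guardrail, Argo CD supports a per-resource sync option that exempts an object from pruning even when the owning application has prune enabled. The sketch below applies it to a hypothetical PersistentVolumeClaim; the claim name and size are assumptions made for illustration.

```yaml
# Sketch only: a critical resource opted out of pruning in Argo CD.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: checkout-data              # hypothetical name
  annotations:
    # Argo CD will not delete this object if it disappears from Git.
    argocd.argoproj.io/sync-options: Prune=false
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

This does not replace narrow application scope or diff review; it only limits the damage when a directory is removed by mistake.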

15. Drift Management

Drift means live state differs from desired state.

Drift can be:

Drift Type	Example
Emergency drift	On-call patched replicas during incident.
Human drift	Someone used `kubectl edit`.
Controller drift	HPA changes replicas.
Defaulting drift	API server defaults fields.
Mutating webhook drift	Admission injected sidecar or labels.
Runtime status drift	Status fields naturally change.
External controller drift	cert-manager updates Secret/cert status.

Not all drift is bad.

The skill is to distinguish:

managed desired-state drift vs expected controller-owned runtime change

15.1 Ignore Differences

GitOps tools often allow ignoring specific diffs.

Use this for:

fields owned by HPA
status fields
injected sidecars
generated annotations
cert-manager-generated fields

Do not use ignore rules to hide real ownership confusion.
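
For the HPA case, a narrowly scoped ignore rule in Argo CD might look like the sketch below. It reuses the checkout-api-prod names from this lesson's Application example; the Deployment name `checkout-api` is an assumption.

```yaml
# Sketch only: ignore exactly one controller-owned field on one resource.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: checkout-api-prod
  namespace: argocd
spec:
  project: payments
  source:
    repoURL: https://git.example.com/platform/platform-live.git
    targetRevision: main
    path: environments/prod/apps/checkout-api
  destination:
    server: https://kubernetes.default.svc
    namespace: checkout-prod
  ignoreDifferences:
    # The HPA owns the replica count, so it should not be reported as drift.
    - group: apps
      kind: Deployment
      name: checkout-api           # assumed Deployment name
      jsonPointers:
        - /spec/replicas
  syncPolicy:
    automated:
      prune: false
      selfHeal: true
    syncOptions:
      # Leave the ignored field untouched during sync, not only in the diff view.
      - RespectIgnoreDifferences=true
```

The rule names one kind, one object, and one field. A rule that ignores whole kinds or whole `spec` subtrees is usually a sign that field ownership has not been decided.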

15.2 Emergency Drift Procedure

If someone must patch production directly:

Record reason in incident channel.
Apply minimal patch.
Pause GitOps sync if needed.
Open PR to reconcile Git with intended post-incident state.
Resume GitOps.
Close incident action item only when drift is resolved.

Emergency patching is allowed. Invisible persistent drift is not.

16. Secrets in GitOps

GitOps and secrets require careful design.

Bad pattern:

apiVersion: v1
kind: Secret
metadata:
  name: db-password
data:
  password: cGFzc3dvcmQ=

Base64 is not encryption.

16.1 Common Secret Models

Model	Description	Trade-off
External Secrets	Git stores reference; secret manager stores value.	Strong separation, operational dependency on secret manager.
Sealed Secrets	Git stores encrypted Secret decryptable by cluster controller.	Simple Git workflow, controller/key management critical.
SOPS + KMS	Git stores encrypted YAML; decrypted by controller/tool with KMS.	Strong and flexible, needs key management discipline.
CSI Secret Store	Secrets mounted from external store at runtime.	Good for runtime mount, may not create native Secret unless configured.
Manual Secret	Operator creates Secret outside Git.	Simple, but poor audit and drift unless tightly controlled.

16.2 Recommended Baseline

For enterprise platforms:

Store secret values in a dedicated secret manager. Store only secret references and access policy in Git.

Example:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: checkout-db
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: platform-secret-store
    kind: ClusterSecretStore
  target:
    name: checkout-db
  data:
  - secretKey: password
    remoteRef:
      key: prod/checkout/db
      property: password

Governance questions:

Who can change secret reference?
Who can change secret value?
Who can grant workload access?
How is rotation tested?
How is revocation performed?
How are secret reads audited?

17. Policy Integration

GitOps should integrate with policy at two levels:

Pre-merge policy — catch bad desired state before it enters Git.
Admission policy — block unsafe state from entering the cluster.

17.1 Pre-Merge Checks

Run:

schema validation
Kubernetes server-side dry-run against representative cluster
policy tests
image signature/provenance checks
resource request checks
forbidden capability checks
NetworkPolicy baseline checks
ownership label checks
diff preview

Example pipeline:

PR -> render -> validate -> policy test -> security scan -> preview diff -> approve -> merge
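
The render, validate, and policy-test stages could be wired up roughly as below. This is a sketch that assumes GitHub Actions as the CI system, a runner with `kubectl`, `kubeconform`, and `conftest` already installed, and a hypothetical `policy-tests/` directory of Rego policies; none of these choices come from the lesson itself.

```yaml
# Sketch only: CI system, tool availability, and paths are assumptions.
name: validate-desired-state
on:
  pull_request:
    paths:
      - "environments/**"
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Render manifests
        # Validate what will actually be applied, not the Kustomize source.
        run: kubectl kustomize environments/prod/apps/checkout-api > rendered.yaml
      - name: Schema validation
        # Custom resources without a published schema are skipped, not failed.
        run: kubeconform -strict -summary -ignore-missing-schemas rendered.yaml
      - name: Policy tests
        run: conftest test rendered.yaml --policy policy-tests/
```

Server-side dry-run, signature checks, and the diff preview are left out here because they need cluster or registry access; they would be additional steps in the same job or a separate, more privileged one.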

17.2 Admission Enforcement

Admission protects against:

direct kubectl bypass
compromised CI
misconfigured GitOps controller permissions
outdated pre-merge validation
unknown clients

Pre-merge checks improve developer feedback. Admission is the final guardrail.

You need both.

18. RBAC and Multi-Tenancy

GitOps controllers need permissions to apply resources.

The dangerous shortcut:

Give GitOps cluster-admin everywhere.

Sometimes a bootstrap controller needs broad power. But application-level GitOps should be constrained.

18.1 Permission Boundaries

Boundary options:

Boundary	Description
Namespace per team	Controller/app can manage only team namespace.
AppProject / project boundary	Restrict repo, destination, namespace, resource kinds.
Cluster-scoped platform app	Only platform repo manages CRDs, ClusterRoles, admission, ingress classes.
Tenant app	Tenant repo manages Deployments, Services, ConfigMaps inside namespace.
Separate controller per tenant	Stronger isolation, more overhead.

18.2 Resource Kind Segmentation

App teams usually should not freely manage:

ClusterRole
ClusterRoleBinding
ValidatingWebhookConfiguration
MutatingWebhookConfiguration
CRD
StorageClass
IngressClass/GatewayClass
PriorityClass
Namespace labels controlling Pod Security

They may manage, under policy:

Deployment
StatefulSet in approved patterns
Service
ConfigMap
ExternalSecret reference
HPA
PDB
NetworkPolicy in their namespace
HTTPRoute attached to approved Gateway
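
In Argo CD, this segmentation can be sketched as an AppProject that allows only the tenant-level kinds listed above. The project name, repository, and namespace reuse names from this lesson's Application example; the exact set of allowed kinds is an assumption to adapt per organization.

```yaml
# Sketch only: a tenant project limited to one repo, one namespace, and approved kinds.
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: payments
  namespace: argocd
spec:
  description: Tenant boundary for the payments team (illustrative)
  sourceRepos:
    - https://git.example.com/platform/platform-live.git
  destinations:
    - server: https://kubernetes.default.svc
      namespace: checkout-prod
  # No clusterResourceWhitelist entries: the intent is that this project
  # grants no cluster-scoped kinds (CRDs, ClusterRoles, webhooks, and so on).
  namespaceResourceWhitelist:
    - group: apps
      kind: Deployment
    - group: ""
      kind: Service
    - group: ""
      kind: ConfigMap
    - group: autoscaling
      kind: HorizontalPodAutoscaler
    - group: policy
      kind: PodDisruptionBudget
    - group: networking.k8s.io
      kind: NetworkPolicy
    - group: external-secrets.io
      kind: ExternalSecret
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
```

An allow-list like this keeps the boundary explicit: adding StatefulSet or any new kind for a tenant becomes a reviewed change to the project rather than an implicit capability.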

19. Environment Strategy

Environment design is a source of hidden complexity.

19.1 Common Models

Model	Shape	Strength	Weakness
Namespace per environment	One cluster, many env namespaces.	Cheap, simple.	Weak isolation.
Cluster per environment	dev/staging/prod clusters.	Better isolation.	More platform overhead.
Cluster per region	prod-us, prod-eu.	Regional blast-radius control.	Promotion complexity.
Cluster per tenant	One cluster per tenant.	Strong isolation.	Expensive and operationally heavy.
Ephemeral preview env	PR creates temporary env.	Great feedback.	Needs automation and cleanup.

19.2 Production Recommendation

For serious systems:

Use separate production clusters or node pools where failure, compliance, or tenant isolation demands it. Use namespaces for logical organization, not as the only hard security boundary.

GitOps should make environment boundaries explicit.

20. Multi-Cluster GitOps

Multi-cluster delivery introduces fleet concerns.

Questions:

Does each cluster pull its own desired state?
Is there one central GitOps control plane managing many clusters?
How are cluster credentials stored?
What happens if central GitOps is down?
How do you roll out platform changes by wave?
How do you prevent global bad commits?
How do you handle regional differences?

20.1 Topologies

Central Controller

Pros:

centralized UI/control
easy fleet view
consistent policy

Cons:

central controller is high-value target
cluster credentials centralized
failure can affect many clusters

Per-Cluster Controller

Pros:

cluster autonomy
smaller blast radius
no central credential concentration

Cons:

harder global visibility
more controllers to manage
policy consistency requires discipline

20.2 Wave-Based Rollout

For fleet changes:

wave 0: dev/internal cluster
wave 1: staging cluster
wave 2: one low-risk production cluster
wave 3: 25% production fleet
wave 4: remaining production fleet

Each wave needs:

health gates
stop condition
rollback/fail-forward path
owner approval
time for observation
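
With a central Argo CD, one way to model waves is one ApplicationSet per wave, each selecting clusters by label and pinned to its own revision. The sketch below is illustrative: the `rollout-wave` cluster label, the `platform` project, the `platform-v1.8.0` tag, and the `platform-system` namespace are all hypothetical.

```yaml
# Sketch only: wave 2 of a fleet rollout; advancing a wave is a PR that bumps targetRevision.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: platform-addons-wave-2
  namespace: argocd
spec:
  generators:
    - clusters:
        selector:
          matchLabels:
            rollout-wave: "2"        # hypothetical label on registered clusters
  template:
    metadata:
      name: 'platform-addons-{{name}}'
    spec:
      project: platform              # hypothetical AppProject
      source:
        repoURL: https://git.example.com/platform/platform-live.git
        targetRevision: platform-v1.8.0   # hypothetical tag pinned for this wave
        path: infrastructure
      destination:
        server: '{{server}}'
        namespace: platform-system   # hypothetical namespace
```

Under this shape, the health gate and owner approval live in the PR that moves the next wave's `targetRevision`, so a bad revision stops at the wave where it was first observed.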

21. Rollback in GitOps

GitOps rollback means reverting desired state.

Options:

Method	Use Case
Git revert commit	Most auditable default.
Revert image digest/tag	Fast application rollback.
Revert Helm values	Config rollback.
Argo CD rollback to previous app revision	Operational shortcut, must reconcile with Git history.
Disable auto-sync temporarily	Incident control, not final state.
Roll forward with fix	When state/data/API migration prevents rollback.

The core invariant:

After incident recovery, Git must represent the desired production state.

Otherwise, the next reconciliation may reintroduce the incident.

21.1 Rollback Is Not Always Safe

Rollback may fail when:

database migration is not backward-compatible
API/event contract changed incompatibly
CRD schema migrated
StatefulSet storage format changed
external dependency changed
feature flag state changed
traffic policy changed outside Git

Therefore GitOps rollback must be combined with deployment compatibility design from earlier parts.

22. Progressive Delivery with GitOps

GitOps does not replace progressive delivery.

It coordinates desired state. Progressive delivery controls exposure and promotion.

Common pattern:

Git change -> GitOps sync -> Rollout controller starts canary -> metrics analysis -> promotion or rollback

With Argo Rollouts, Flagger, service mesh, Gateway API, or ingress integrations, GitOps can declare rollout strategy while another controller manages traffic progression.

Example shape:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: checkout-api
spec:
  strategy:
    canary:
      steps:
      - setWeight: 10
      - pause: {duration: 10m}
      - setWeight: 50
      - pause: {duration: 20m}

The important separation:

Layer	Responsibility
GitOps	Desired rollout object exists.
Rollout controller	Gradually changes exposure.
Metrics system	Supplies objective health signals.
Policy	Blocks unsafe changes.
Human/platform	Defines risk appetite.

23. GitOps Failure Modes

23.1 Bad Commit Syncs Everywhere

Cause:

shared base changed without wave rollout
auto-sync enabled globally
insufficient validation

Mitigation:

wave rollout
branch/path protections
environment-specific approval
policy tests
progressive delivery
app segmentation

23.2 GitOps Controller Deletes Resources

Cause:

prune enabled
directory removed
ownership boundary too broad

Mitigation:

narrow application scope
careful prune enablement
protect critical resources
app-of-apps review
dry-run/diff preview

23.3 Drift Hidden by Ignore Rules

Cause:

broad ignoreDifferences
unclear field ownership

Mitigation:

ignore only expected controller-owned fields
review ignore rules
add ownership documentation

23.4 Secret Decryption Failure

Cause:

KMS/key issue
sealed secret controller key lost
external secret store unavailable
RBAC changed

Mitigation:

key backup/rotation procedure
secret-store SLO
bootstrap recovery docs
alerting on sync failures

23.5 Git Provider Outage

Cause:

GitHub/GitLab/internal Git unavailable

Mitigation:

controllers continue with last known desired state where possible
avoid requiring Git for steady-state runtime
know cache behavior
avoid unnecessary resync during outage

23.6 Controller Compromise

Cause:

GitOps controller has broad cluster credentials

Mitigation:

least privilege
separate controllers/projects
admission policy
signed commits/artifacts
audit logs
network isolation

24. Bootstrap Problem

How does GitOps manage the cluster before GitOps exists?

This is bootstrap.

Options:

Method	Description
Manual install	Install GitOps controller once, then hand over to Git.
Terraform/bootstrap pipeline	Infrastructure tool creates cluster and installs GitOps.
Cluster API	Cluster lifecycle plus GitOps bootstrap.
Immutable cluster image	Pre-baked baseline for specialized environments.

Bootstrap should be minimal:

Create cluster.
Install GitOps controller.
Point controller to platform desired-state repo.
GitOps installs everything else.

The anti-pattern:

Half the cluster is managed by Terraform, half by Helm manually, half by GitOps.

Yes, that is three halves. That is how it feels operationally.

24.1 Ownership Split with Terraform

A common clean boundary:

Terraform Owns	GitOps Owns
Cloud network	Kubernetes namespaces
Cluster resource	Workloads
Node pools	RBAC within cluster
IAM roles	NetworkPolicy
Managed databases	ExternalSecret references
DNS zones base	Ingress/Gateway routes
Initial GitOps install	Platform add-ons after bootstrap

Do not let Terraform and GitOps fight over the same Kubernetes object.

25. Application Definition Example

A simple Argo CD Application shape:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: checkout-api-prod
  namespace: argocd
spec:
  project: payments
  source:
    repoURL: https://git.example.com/platform/platform-live.git
    targetRevision: main
    path: environments/prod/apps/checkout-api
  destination:
    server: https://kubernetes.default.svc
    namespace: checkout-prod
  syncPolicy:
    automated:
      prune: false
      selfHeal: true
    syncOptions:
    - CreateNamespace=false

Design notes:

project should restrict allowed destinations and resources.
targetRevision for production may be branch, tag, or commit depending on governance.
prune should not be enabled casually.
namespace creation may be owned by platform, not app.
source path should be narrow enough to avoid accidental ownership expansion.

26. Repository Example

platform-live/
  clusters/
    prod-jakarta-1/
      root.yaml
      platform/
        ingress-controller.yaml
        cert-manager.yaml
        external-secrets.yaml
        observability.yaml
        policies.yaml
      tenants/
        payments.yaml
        enforcement.yaml
      apps/
        checkout-api.yaml
        case-management-api.yaml
    staging-jakarta-1/
      root.yaml
  environments/
    prod/
      apps/
        checkout-api/
          kustomization.yaml
          deployment.yaml
          service.yaml
          httproute.yaml
          hpa.yaml
          pdb.yaml
          externalsecret.yaml
          networkpolicy.yaml
    staging/
      apps/
        checkout-api/
          kustomization.yaml
  policies/
    baseline/
    restricted/
  README.md

This structure makes cluster composition and app desired state explicit.

27. GitOps for Regulated Systems

In regulated or enforcement lifecycle systems, GitOps is attractive because it provides:

change history
review evidence
separation of duties
reproducible desired state
audit trail
rollback trail
environment promotion record
policy enforcement
reduced direct production access

But the platform must also handle:

emergency access records
approval mapping
incident exceptions
data migration evidence
configuration provenance
tenant impact analysis
retention requirements

A production change record can include:

# Production Deployment Record

- Service:
- Image digest:
- Git commit:
- PR:
- Approvers:
- Policy checks:
- Security checks:
- Migration included: yes/no
- Rollback plan:
- SLO dashboard:
- Deployment window:
- Incident link if emergency:

GitOps makes this easier, not automatic.

28. GitOps Adoption Strategy

Do not migrate everything at once.

Phase 1 — Observe

install GitOps in non-prod
manage one low-risk app
use manual sync
learn diff/health/prune behavior

Phase 2 — Standardize

define repo structure
define labels/annotations
define AppProject/RBAC model
add pre-merge validation
add policy checks

Phase 3 — Expand

onboard more apps
add auto-sync for non-prod
integrate secrets model
define promotion workflow
build dashboards

Phase 4 — Productionize

production app onboarding
incident procedure
SLO-based rollback/promotion
audit evidence
multi-cluster model
platform catalog integration

Phase 5 — Optimize

self-service onboarding
golden paths
progressive delivery
fleet waves
drift analytics
cost/security/reliability guardrails

29. GitOps Readiness Checklist

Before production GitOps:

Is the source of truth clearly defined?
Are direct cluster changes restricted?
Is repo ownership clear?
Are CODEOWNERS or equivalent approvals configured?
Are manifests rendered and validated before merge?
Are policies tested before merge and enforced at admission?
Are secrets handled safely?
Are GitOps controller permissions bounded?
Are app/platform resource boundaries clear?
Is prune strategy safe?
Are drift rules reviewed?
Is rollback procedure tested?
Is emergency patch procedure documented?
Are sync failures alerted?
Are app health checks meaningful?
Is promotion explicit?
Are production changes auditable?
Is multi-cluster blast radius controlled?

30. Practical Lab: Build a Minimal GitOps Operating Model

Use a non-production cluster.

Step 1 — Define Desired State Repo

gitops-lab/
  apps/
    hello-api/
      deployment.yaml
      service.yaml
      kustomization.yaml
  clusters/
    local/
      hello-api-application.yaml

Step 2 — Add Application Manifest

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-api
  labels:
    app.kubernetes.io/name: hello-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: hello-api
  template:
    metadata:
      labels:
        app.kubernetes.io/name: hello-api
    spec:
      containers:
      - name: app
        image: registry.k8s.io/echoserver:1.10
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /
            port: 8080
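
The repo layout in Step 1 also lists service.yaml and kustomization.yaml. A minimal sketch of both follows; the Service port of 80 is an assumption, while the selector and target port match the Deployment above. The two documents belong in separate files.

```yaml
# apps/hello-api/service.yaml (sketch)
apiVersion: v1
kind: Service
metadata:
  name: hello-api
spec:
  selector:
    app.kubernetes.io/name: hello-api
  ports:
    - port: 80            # assumed Service port
      targetPort: 8080    # matches containerPort in the Deployment
---
# apps/hello-api/kustomization.yaml (sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
```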

Step 3 — Reconcile Through GitOps Tool

Install Argo CD or Flux in the lab cluster, then point it to the repo/path.

Observe:

sync status
health status
live object diff
events
controller logs
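
If Argo CD is the tool chosen for the lab, clusters/local/hello-api-application.yaml might look like the sketch below. The repository URL and the `hello` namespace are hypothetical; `selfHeal` is enabled so that the drift created in the next step is visibly corrected.

```yaml
# Sketch only: lab Application; repoURL and namespace are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: hello-api
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/you/gitops-lab.git   # hypothetical URL
    targetRevision: main
    path: apps/hello-api
  destination:
    server: https://kubernetes.default.svc
    namespace: hello                 # hypothetical namespace
  syncPolicy:
    automated:
      prune: false
      selfHeal: true                 # revert manual changes to match Git
    syncOptions:
      - CreateNamespace=true         # acceptable in a lab; production may differ
```

To compare behaviors, remove the `automated` block, repeat the drift step, and note that the application then only reports OutOfSync until someone syncs it.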

Step 4 — Create Drift

kubectl scale deployment hello-api --replicas=5

Observe whether GitOps reports drift and whether self-heal restores the desired state.

Step 5 — Change Git

Update replicas in Git:

replicas: 3

Merge and observe reconciliation.

Step 6 — Write Learning Note

Answer:

What is the source of truth?
What drift was detected?
What changed automatically?
What required human approval?
What would be dangerous in production?
What guardrail would you add first?

31. Common Anti-Patterns

31.1 GitOps Without Ownership

A repo without ownership becomes shared mutable infrastructure soup.

31.2 GitOps With Cluster-Admin Everywhere

This centralizes risk and weakens tenant isolation.

31.3 Auto-Prune Too Early

Prune should come after ownership boundaries are proven.

31.4 Ignoring Rendered Output

Helm/Kustomize source is not enough. Validate rendered manifests.

31.5 Environment Drift by Copy-Paste

If dev/staging/prod diverge unintentionally, promotion becomes guesswork.

31.6 Secret Values in Plain Git

Base64 is not encryption.

31.7 Manual Hotfix Never Reconciled Back

Emergency patches must become Git state or be intentionally reverted.

31.8 GitOps as a Platform Bottleneck

If every app change requires platform engineers to edit YAML, GitOps has become centralized ops with better logs.

31.9 No Stop Button

Every auto-sync system needs a safe pause/emergency procedure.

31.10 No Runtime Feedback Loop

GitOps can deploy broken desired state perfectly. You still need metrics, SLOs, and progressive delivery.

32. Summary

GitOps is the operating model that aligns Kubernetes with Git-based change control.

The core ideas:

Git is the desired-state source of truth.
Controllers pull and reconcile state continuously.
CI builds artifacts; GitOps deploys desired state.
Promotion should move immutable artifacts across environments.
Drift must be visible and intentionally managed.
Secrets require a dedicated strategy.
Pre-merge policy and admission policy solve different problems.
Controller RBAC must match tenant/platform boundaries.
Multi-cluster GitOps needs wave rollout and blast-radius design.
Rollback is Git history plus compatibility discipline.
GitOps does not replace observability, progressive delivery, or incident response.

The mature stance is:

GitOps is not YAML in Git. It is auditable, reconciled, policy-governed desired-state operation.

33. References

OpenGitOps — Principles: https://opengitops.dev/
Argo CD Documentation — Declarative GitOps CD for Kubernetes: https://argo-cd.readthedocs.io/en/stable/
Argo CD Documentation — Declarative Setup: https://argo-cd.readthedocs.io/en/stable/operator-manual/declarative-setup/
Flux Documentation: https://fluxcd.io/flux/
Kubernetes Documentation — Declarative Management of Kubernetes Objects: https://kubernetes.io/docs/tasks/manage-kubernetes-objects/declarative-config/
Kubernetes Documentation — Server-Side Apply: https://kubernetes.io/docs/reference/using-api/server-side-apply/
Kubernetes Documentation — Secrets: https://kubernetes.io/docs/concepts/configuration/secret/
External Secrets Operator Documentation: https://external-secrets.io/
