Team Topology and Service Ownership

Learn Java Microservices Design and Architect - Part 069

Team topology and service ownership for Java microservices: ownership boundaries, stream-aligned teams, platform teams, cognitive load, RACI, on-call, service catalog metadata, CODEOWNERS, and operating model review.

2026-07-05 · 27 min read · 5288 words

Part 069 — Team Topology and Service Ownership

1. Core Idea

Microservices are not primarily a packaging technique.

They are an ownership architecture.

A service boundary is only useful when it maps to a boundary where a team can:

understand the domain,
change the code,
deploy independently,
operate production,
own the data contract,
handle incidents,
evolve the API,
retire the service,
and be accountable for business outcomes.

If no team owns the service end-to-end, the system is not a healthy microservice architecture.

It is only a distributed codebase.

The test is simple:

When this service fails at 02:00, who has the authority, context, and responsibility to decide what happens next?

If the answer is vague, ownership is broken.

2. What This Part Is Not

This part is not a generic management chapter.

It is an architecture chapter.

Team topology directly affects:

service boundary,
API shape,
deployment independence,
incident response,
data ownership,
cognitive load,
platform design,
governance cost,
and failure recovery.

A microservice that crosses ownership boundaries too often will become slow to change, hard to debug, and politically expensive.

A service owned by a team that cannot operate it will become fragile.

A service owned by everyone will eventually be owned by no one.

3. Mental Model: A Service Is a Socio-Technical Unit

A production microservice has two sides.

3.1 Technical side

code repository,
runtime deployment,
database/schema,
API contract,
event contract,
metrics,
logs,
traces,
secrets,
CI/CD pipeline,
infrastructure manifest,
test suite,
runbook.

3.2 Social side

owning team,
domain experts,
operational escalation path,
product owner,
security contact,
data steward,
consumer teams,
platform support boundary,
governance reviewers.

Architecture fails when those two sides do not match.

Example:

Team A owns code.
Team B owns database.
Team C owns Kubernetes manifests.
Team D owns API gateway policy.
Team E receives alerts.
Team F approves schema changes.

This is not end-to-end ownership.

It is a distributed bottleneck machine.

4. Ownership Is Not Just “Who Wrote the Code”

A mature service owner owns the service across its lifecycle.

Ownership Area	What It Means
Business capability	The team understands what business outcome the service supports.
Code	The team can safely change the service implementation.
API contract	The team owns compatibility, documentation, and consumer communication.
Data authority	The team owns write rules, data quality, lineage, and deletion semantics.
Runtime	The team owns deployment, capacity, configuration, and runtime behavior.
Reliability	The team owns SLOs, alert response, runbooks, and incident learning.
Security	The team owns threat model, secrets, identity, and authorization assumptions.
Cost	The team understands the service's resource and platform cost profile.
Lifecycle	The team owns birth, maturity, deprecation, and retirement.

A weak ownership model says:

“The service belongs to the payments team.”

A strong ownership model says:

“The Payments Capability Team owns the Payment Authorization Service, including API compatibility, payment-state data authority, SLOs, incident response, consumer communication, rollout risk, and deprecation policy.”

That is a different level of accountability.

5. Ownership Topology Should Match Change Topology

A service boundary should exist where change can happen mostly independently.

Ask:

Who requests changes to this capability?
Who understands the business rules?
Who has authority to resolve conflicting requirements?
Who receives incident impact?
Who negotiates API changes with consumers?
Who can deploy without waiting for another team?

If changes always require three teams to coordinate, the service boundary is probably wrong.

If one team owns five tightly coupled services that always deploy together, those services may be modules, not independent microservices.

If one service contains rules owned by many teams, the service may be a boundary violation.

6. Mermaid: Ownership Topology vs Runtime Topology

The diagram is not only about calls.

It shows who owns authority.

A good architecture diagram should make both runtime dependency and ownership visible.

7. Team Types in a Microservices Organization

A healthy Java microservices organization usually contains several team shapes.

The labels matter less than the responsibility boundaries.

7.1 Stream-aligned team

A stream-aligned team owns a business value stream or business capability.

Examples:

Case Intake Team,
Payment Authorization Team,
Claims Assessment Team,
Enforcement Lifecycle Team,
Customer Onboarding Team.

This team should own services that represent business capability.

It should be able to:

talk directly to domain experts,
prioritize feature work,
own production behavior,
respond to incidents,
evolve APIs,
and deploy independently.

Most business microservices should belong here.

7.2 Platform team

A platform team builds internal capabilities that reduce cognitive load for stream-aligned teams.

Examples:

Java service template,
CI/CD golden path,
observability platform,
secrets platform,
Kubernetes runtime,
service catalog,
deployment automation,
approved libraries,
self-service database provisioning.

A platform team should not become a ticket queue for every deployment.

The platform should provide paved roads.

The stream-aligned team should still own the service.

7.3 Enabling team

An enabling team helps other teams learn a capability.

Examples:

helping teams adopt OpenTelemetry,
improving threat modeling practice,
teaching resilience testing,
helping migrate to a new service template,
improving architecture review quality.

The enabling team should not permanently own the service.

Its job is to increase capability, then move on.

7.4 Complicated-subsystem team

Some parts of the system require deep specialist knowledge.

Examples:

search relevance engine,
fraud scoring model,
rules compiler,
cryptographic signing service,
high-throughput matching engine,
domain-specific optimization solver.

A complicated-subsystem team may own such services or components.

But this should be used carefully.

If every service is considered “too complex” for stream-aligned teams, the organization has not reduced cognitive load. It has centralized knowledge.

8. Service Ownership Patterns

8.1 Good pattern: one accountable owning team

One team owns the service end-to-end.

Other teams may contribute, but accountability is clear.

This is the default pattern.

8.2 Good pattern: platform-owned platform service

A platform service may be owned by a platform team.

Examples:

internal identity broker,
deployment service,
observability collector,
service catalog,
config distribution service.

But platform services should expose a product-like interface.

They need:

documentation,
SLOs,
consumer support channel,
compatibility policy,
migration guidance,
operational transparency.

A platform service without a product interface becomes hidden infrastructure bureaucracy.

8.3 Risky pattern: shared service ownership

Two teams “jointly own” one service.

This often means:

unclear roadmap,
mixed domain language,
slow approval,
ownership gaps,
alert confusion,
conflicting priorities.

Shared ownership can work only when responsibility is explicitly split by surface.

Example:

Surface	Owner
Business rules	Team A
Runtime platform	Platform Team
Database operation	Database Platform Team
Incident commander	Team A
Product roadmap	Team A

The service owner remains Team A.

The platform team owns platform capability, not business service behavior.

8.4 Bad pattern: orphan service

An orphan service has no active owner.

Symptoms:

last meaningful commit was years ago,
alerts route to a generic channel,
nobody knows the data contract,
upgrades are risky,
dependencies are stale,
consumers are afraid to change it,
incident mitigation is manual folklore.

An orphan service is a liability.

It needs one of three outcomes:

assign owner,
merge into another service,
retire.

8.5 Bad pattern: committee-owned service

A committee-owned service requires approval from many teams for routine change.

This often happens with “central common service” designs.

Examples:

common customer service,
common notification service,
common workflow service,
common document service,
common audit service.

Common services are not always bad.

But they must have a clear product boundary.

If every domain change requires changing the common service, it is not common. It is a shared bottleneck.

9. The Ownership Rule of Thumb

Use this rule:

The team that owns the business invariant should own the service that enforces it.

Example:

If the rule is:

“A regulatory case cannot be escalated before minimum evidence requirements are satisfied.”

Then the team responsible for case lifecycle policy should own the service enforcing that invariant.

Not the API platform team.

Not the database team.

Not a generic workflow team.

The workflow platform may provide execution primitives.

The case lifecycle team owns the business rule.
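One way to make this concrete is to keep the invariant inside the domain model of the owning service, so that no gateway, workflow engine, or database rule has to know about it. The following is a minimal Java 17+ sketch, not a definitive implementation. The class `CaseLifecycle`, the constant `MIN_EVIDENCE_ITEMS`, and its value are hypothetical; the source does not define what the minimum evidence requirement is.

```java
import java.util.Objects;

/** Hypothetical aggregate owned by the case lifecycle team. */
final class CaseLifecycle {

    enum State { SUBMITTED, ESCALATED, CLOSED }

    // Assumed threshold for illustration only; the real requirement is a policy decision.
    static final int MIN_EVIDENCE_ITEMS = 2;

    private final String caseId;
    private State state = State.SUBMITTED;
    private int acceptedEvidenceItems;

    CaseLifecycle(String caseId) {
        this.caseId = Objects.requireNonNull(caseId, "caseId");
    }

    /** Records one accepted evidence item for this case. */
    void recordAcceptedEvidence() {
        acceptedEvidenceItems++;
    }

    /** Enforces the invariant: no escalation before the minimum evidence is present. */
    void escalate() {
        if (state != State.SUBMITTED) {
            throw new IllegalStateException(
                    "Case " + caseId + " cannot be escalated from state " + state);
        }
        if (acceptedEvidenceItems < MIN_EVIDENCE_ITEMS) {
            throw new IllegalStateException(
                    "Case " + caseId + " needs at least " + MIN_EVIDENCE_ITEMS
                            + " accepted evidence items before escalation");
        }
        state = State.ESCALATED;
    }

    State state() {
        return state;
    }
}
```

Because the check lives in code the case lifecycle team owns, a change to the evidence policy is a change inside one team's boundary.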

10. RACI for Service Ownership

RACI is often abused, but it is useful for making ambiguity visible.

For a microservice, define:

Responsible: does the work,
Accountable: owns the outcome,
Consulted: gives input before decisions,
Informed: receives notification.

Example:

Activity	Owning Team	Platform Team	Security Team	Data Governance	Consumer Team
Change business rule	A/R	C	C	C	I
Change API contract	A/R	C	C	I	C
Deploy service	A/R	C	I	I	I
Respond to incident	A/R	C	C	I	I
Rotate service secret	A/R	C	C	I	I
Change Kubernetes base template	C	A/R	C	I	I
Define SLO	A/R	C	C	C	C
Approve data retention policy	R	I	C	A	I
Deprecate endpoint	A/R	I	I	I	C

Important:

There should be exactly one accountable party for each activity, and exactly one accountable owning team for each service.

Multiple consulted stakeholders are normal.

Multiple accountable owners usually means no owner.
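The "one accountable owner" rule is easy to check mechanically once the matrix is data. The sketch below is one possible Java 17+ check, assuming the matrix is held as a map from activity to a map of team to RACI code. The class name `RaciCheck` and the "Change shared schema" row are hypothetical and exist only to show a violation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RaciCheck {

    /**
     * Returns the activities that do not have exactly one accountable party.
     * A cell counts as accountable when its code contains 'A' (for example "A" or "A/R").
     */
    static List<String> activitiesWithoutSingleAccountable(
            Map<String, Map<String, String>> matrix) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> row : matrix.entrySet()) {
            long accountable = row.getValue().values().stream()
                    .filter(code -> code.contains("A"))
                    .count();
            if (accountable != 1) {
                problems.add(row.getKey());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> matrix = new LinkedHashMap<>();
        matrix.put("Deploy service",
                Map.of("Owning Team", "A/R", "Platform Team", "C"));
        matrix.put("Approve data retention policy",
                Map.of("Owning Team", "R", "Data Governance", "A"));
        // Hypothetical bad row: two teams both claim accountability.
        matrix.put("Change shared schema",
                Map.of("Team A", "A/R", "Team B", "A/R"));

        // prints [Change shared schema]
        System.out.println(activitiesWithoutSingleAccountable(matrix));
    }
}
```

A row with zero accountable parties is reported as well, since an activity nobody is accountable for is the same ambiguity in a different form.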

11. Ownership Contract

Every production service should have an ownership contract.

This does not need to be a long document.

It should be discoverable, versioned, and close to the code.

Example:

service: case-lifecycle-service
owner:
  team: case-lifecycle-team
  slack: '#team-case-lifecycle'
  escalation: pagerduty-case-lifecycle
  product_owner: case-lifecycle-product-owner
  tech_lead: case-lifecycle-tech-lead

purpose:
  capability: regulatory-case-lifecycle-management
  summary: Owns case lifecycle state transitions, escalation rules, and lifecycle audit evidence.

authority:
  owns_data:
    - case_lifecycle_state
    - case_escalation_history
    - case_assignment_state
  owns_apis:
    - POST /cases/{caseId}/submit
    - POST /cases/{caseId}/escalate
    - GET /cases/{caseId}/lifecycle
  owns_events:
    - CaseSubmitted
    - CaseEscalated
    - CaseClosed

operations:
  tier: tier-1
  slo:
    availability: 99.9
    p95_latency_ms: 300
  runbook: docs/runbooks/case-lifecycle.md
  dashboard: https://observability.example.com/case-lifecycle
  alert_policy: docs/alerts/case-lifecycle-alerts.md

lifecycle:
  state: production
  introduced: 2026-03-01
  last_reviewed: 2026-07-05
  next_review_due: 2026-10-05

This contract prevents architectural amnesia.

12. Service Catalog as Ownership Infrastructure

A service catalog is not a pretty portal.

It is ownership infrastructure.

At minimum, it should answer:

What does this service do?
Who owns it?
What system does it belong to?
What APIs/events does it expose?
What data does it own?
What dependencies does it have?
What lifecycle state is it in?
Where are dashboards, logs, traces, runbooks, and alerts?
What is the escalation path?
What is the service tier?
What is the SLO?
What is the deprecation policy?

If engineers cannot discover this in minutes, the organization does not have real service ownership.
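The minimum question list can be treated as a completeness check on each catalog entry. The following Java 17+ sketch assumes an entry has already been loaded into a flat map of strings. The field names in `REQUIRED_FIELDS` are hypothetical labels for the questions above, not the schema of any particular catalog product.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class CatalogCompleteness {

    // Hypothetical field names; each one stands for a question the catalog should answer.
    static final List<String> REQUIRED_FIELDS = List.of(
            "description", "owner", "system", "apis", "ownedData", "dependencies",
            "lifecycle", "runbook", "dashboard", "escalation", "tier", "slo",
            "deprecationPolicy");

    /** Returns the required fields that are absent or blank in a catalog entry. */
    static List<String> missingFields(Map<String, String> entry) {
        List<String> missing = new ArrayList<>();
        for (String field : REQUIRED_FIELDS) {
            String value = entry.get(field);
            if (value == null || value.isBlank()) {
                missing.add(field);
            }
        }
        return missing;
    }
}
```

An empty result only shows that the fields are filled in. Whether the answers are still true is a question for the ownership review.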

13. Example: Backstage-style `catalog-info.yaml`

apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: case-lifecycle-service
  description: Owns regulatory case lifecycle transitions, escalation, closure, and lifecycle audit events.
  tags:
    - java
    - microservice
    - regulatory
    - tier-1
  annotations:
    github.com/project-slug: example/case-lifecycle-service
    backstage.io/techdocs-ref: dir:.
    pagerduty.com/service-id: PCASE123
spec:
  type: service
  lifecycle: production
  owner: group:case-lifecycle-team
  system: regulatory-case-management
  providesApis:
    - case-lifecycle-rest-api
  consumesApis:
    - evidence-summary-api
    - policy-evaluation-api
  dependsOn:
    - resource:case-lifecycle-postgres
    - component:evidence-service
    - component:decision-policy-service
---
apiVersion: backstage.io/v1alpha1
kind: API
metadata:
  name: case-lifecycle-rest-api
  description: REST API for case lifecycle commands and query views.
spec:
  type: openapi
  lifecycle: production
  owner: group:case-lifecycle-team
  system: regulatory-case-management
  definition:
    $text: ./openapi/case-lifecycle.yaml

The catalog is not the source of architectural truth by itself.

It is an index into living truth:

code,
documentation,
contracts,
telemetry,
ownership,
runtime.

14. Ownership in the Repository

A Java microservice repository should make ownership visible.

Recommended files:

case-lifecycle-service/
  catalog-info.yaml
  OWNERS.md
  CODEOWNERS
  README.md
  docs/
    adr/
    runbooks/
    operations/
    api/
    events/
    security/
  src/main/java/...
  src/test/java/...
  deploy/
  .github/workflows/

Example CODEOWNERS:

# Default owner
* @org/case-lifecycle-team

# API contract requires API review
/openapi/** @org/case-lifecycle-team @org/api-governance

# Deployment manifests require platform visibility
/deploy/** @org/case-lifecycle-team @org/platform-runtime

# Security-sensitive config requires security review
/deploy/secrets/** @org/case-lifecycle-team @org/security-engineering

# ADRs are owned by the service team, but architecture group is consulted
/docs/adr/** @org/case-lifecycle-team @org/architecture-review

Do not use CODEOWNERS as a replacement for team accountability.

It is only a routing mechanism.

15. Ownership and Cognitive Load

Microservices create local autonomy but increase total system complexity.

A team can own only so much.

Cognitive load includes:

business complexity,
code complexity,
runtime complexity,
operational complexity,
security complexity,
data complexity,
dependency complexity,
compliance complexity,
incident complexity,
migration complexity.

A team that owns too many services will stop owning them deeply.

Symptoms:

runbooks are stale,
alerts are ignored,
upgrades are deferred,
consumers wait too long,
service catalog is incomplete,
team cannot explain failure modes,
incident response depends on one senior engineer,
no one knows which service owns a business rule.

Service count is not the only measure.

One high-risk service may consume more cognitive load than ten simple services.

16. Cognitive Load Budget Example

Service	Domain Complexity	Runtime Complexity	Operational Criticality	Security/Compliance	Cognitive Load
case-lifecycle-service	High	Medium	High	High	Very high
evidence-metadata-service	Medium	Medium	High	High	High
notification-template-service	Low	Low	Medium	Medium	Low
reference-data-service	Medium	Low	Medium	Medium	Medium
document-rendering-service	Low	High	Medium	Medium	Medium-high

A team owning all of these may be overloaded.

Better options:

split ownership by capability,
move common runtime concerns to platform,
reduce service count by merging low-autonomy services,
create an enabling engagement,
improve service template and automation,
retire unused services.
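The table above can be turned into a rough numeric budget. The Java 17+ sketch below assumes a simple scoring scheme (Low = 1, Medium = 2, High = 3, summed across the four dimensions) and a team budget of 30 points. Both the scheme and the budget are illustrative assumptions, not values from the source; a real team would calibrate them against its own experience.

```java
import java.util.List;

final class CognitiveLoadBudget {

    enum Level {
        LOW(1), MEDIUM(2), HIGH(3);

        final int points;

        Level(int points) {
            this.points = points;
        }
    }

    /** One row of the table: four rated dimensions per service. */
    record ServiceLoad(String service, Level domain, Level runtime,
                       Level criticality, Level compliance) {
        int score() {
            return domain.points + runtime.points + criticality.points + compliance.points;
        }
    }

    static int totalScore(List<ServiceLoad> services) {
        return services.stream().mapToInt(ServiceLoad::score).sum();
    }

    static boolean exceedsBudget(List<ServiceLoad> services, int budget) {
        return totalScore(services) > budget;
    }

    public static void main(String[] args) {
        List<ServiceLoad> portfolio = List.of(
                new ServiceLoad("case-lifecycle-service",
                        Level.HIGH, Level.MEDIUM, Level.HIGH, Level.HIGH),
                new ServiceLoad("evidence-metadata-service",
                        Level.MEDIUM, Level.MEDIUM, Level.HIGH, Level.HIGH),
                new ServiceLoad("notification-template-service",
                        Level.LOW, Level.LOW, Level.MEDIUM, Level.MEDIUM),
                new ServiceLoad("reference-data-service",
                        Level.MEDIUM, Level.LOW, Level.MEDIUM, Level.MEDIUM),
                new ServiceLoad("document-rendering-service",
                        Level.LOW, Level.HIGH, Level.MEDIUM, Level.MEDIUM));

        // prints 42 > 30? true
        System.out.println(totalScore(portfolio) + " > 30? " + exceedsBudget(portfolio, 30));
    }
}
```

The number is a conversation starter, not a verdict. Its value is that it makes "this team owns too much" arguable with the same inputs for every team.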

17. Platform Team Boundary

A common failure mode:

“The platform team owns microservices.”

That is wrong.

The platform team owns platform capabilities.

The service team owns service behavior.

Concern	Stream-Aligned Team	Platform Team
Business rule	Owns	Does not own
API behavior	Owns	Provides gateway tooling
Service code	Owns	Provides template/scaffolding
Runtime configuration	Owns service config	Provides config mechanism
Deployment pipeline	Owns service pipeline usage	Provides pipeline platform
Kubernetes manifests	Owns service-specific intent	Provides base charts/policies
Observability instrumentation	Owns semantic telemetry	Provides telemetry platform
Incident response	Owns service incident	Supports platform-related incident
Secret rotation	Owns service credential usage	Provides secret-management mechanism
SLO	Owns user-facing SLO	Provides SLO tooling
Cost	Owns service cost decisions	Provides cost visibility/guardrails

A strong platform team reduces friction without stealing accountability.

18. Platform as Product

Internal platform capabilities should have product qualities.

A platform capability should define:

target users,
supported use cases,
self-service flow,
paved-road defaults,
escape hatch,
support model,
SLO,
compatibility policy,
deprecation policy,
documentation,
examples,
adoption metrics.

Bad platform:

“Open a ticket and wait.”

Good platform:

“Generate a Java service from the template, get CI/CD, logging, metrics, tracing, health checks, deployment manifests, SLO dashboard, and service catalog entry by default.”

The platform should make the right path cheaper than the wrong path.

19. Service Ownership and On-Call

If a team owns production behavior, it must participate in operational responsibility.

This does not mean every team must run an identical 24/7 pager model.

But it means every production service needs:

escalation owner,
incident triage process,
response expectations,
runbook,
alert routing,
platform escalation path,
severity definition,
rollback/mitigation authority,
post-incident review ownership.

A service that pages nobody is unmanaged risk.

A service that pages the wrong team is organizational debt.

20. Alert Ownership

Alert ownership should follow mitigation authority.

Ask:

Who can decide to disable a feature flag?
Who can scale the service?
Who can rollback?
Who can disable a consumer?
Who can change rate limits?
Who can repair data?
Who understands the business impact?

That team should own or be first responder for the alert.

If the platform team receives every service alert, the platform team becomes the bottleneck and loses context.

The platform team should receive alerts for platform capability failures.

Service teams should receive alerts for service behavior failures.

21. Ownership and Data Authority

Data ownership is one of the clearest tests of service ownership.

For each important data element, define:

authoritative service,
owning team,
allowed writers,
read exposure method,
deletion owner,
retention owner,
quality owner,
reconciliation owner,
privacy owner,
audit evidence owner.

Example:

Data	Authoritative Service	Owning Team	Allowed Writers	Exposure
case lifecycle state	case-lifecycle-service	Case Lifecycle Team	case-lifecycle-service only	REST query API/event
evidence metadata	evidence-service	Evidence Team	evidence-service only	API/event/read model
policy decision result	decision-policy-service	Decision Policy Team	decision-policy-service only	event/API
notification delivery status	notification-service	Messaging Team	notification-service only	event/query API

If nobody owns data quality, the data will decay.

If multiple services write the same business state, incidents become archaeology.
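The "allowed writers" column can be verified against what actually happens. The sketch below, in Java 17+, assumes write observations are available from some source such as audit logs or database grants; that source, and the names `DataAuthorityCheck` and `WriteObservation`, are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class DataAuthorityCheck {

    /** One observed write: which service wrote which data element. */
    record WriteObservation(String dataElement, String writerService) {}

    /**
     * Returns, per data element, the services that wrote it without being
     * its authoritative service. Elements missing from the authority map
     * have no owner, so every writer to them is reported.
     */
    static Map<String, Set<String>> unauthorizedWriters(
            Map<String, String> authority, List<WriteObservation> observations) {
        Map<String, Set<String>> violations = new TreeMap<>();
        for (WriteObservation o : observations) {
            String authoritative = authority.get(o.dataElement());
            if (!o.writerService().equals(authoritative)) {
                violations.computeIfAbsent(o.dataElement(), k -> new TreeSet<>())
                        .add(o.writerService());
            }
        }
        return violations;
    }
}
```

A non-empty result points at either a boundary violation or an out-of-date authority table. Both are worth knowing before the next incident.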

22. Ownership and API Compatibility

The API owner must own compatibility.

This includes:

schema evolution,
endpoint lifecycle,
error contract,
deprecation notice,
consumer migration,
backward compatibility tests,
contract test maintenance,
release notes,
production monitoring for deprecated usage.

A service team cannot say:

“Consumers should just update.”

Microservices only scale when producers and consumers can evolve independently.

That requires compatibility discipline.
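A small part of that discipline can be automated as a backward compatibility test. The Java 17+ sketch below compares two versions of a response shape, each reduced to a map of field name to type name. This flattened representation is an assumption made for brevity; a real check would work on the OpenAPI or event schema itself. Added fields are treated as compatible, which assumes consumers ignore unknown fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CompatibilityCheck {

    /**
     * Lists changes that can break existing consumers of a response:
     * a field they may read was removed, or its type changed.
     */
    static List<String> breakingChanges(
            Map<String, String> oldFields, Map<String, String> newFields) {
        List<String> breaking = new ArrayList<>();
        // TreeMap gives a stable, alphabetical report order.
        for (Map.Entry<String, String> field : new TreeMap<>(oldFields).entrySet()) {
            String newType = newFields.get(field.getKey());
            if (newType == null) {
                breaking.add("removed field: " + field.getKey());
            } else if (!newType.equals(field.getValue())) {
                breaking.add("changed type of " + field.getKey() + ": "
                        + field.getValue() + " -> " + newType);
            }
        }
        return breaking;
    }
}
```

Run in the producer's pipeline, a non-empty list turns "consumers should just update" into an explicit deprecation decision the API owner has to make.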

23. Consumer Relationship Model

Every important consumer should be classified.

Consumer Type	Relationship	Required Discipline
Internal same team	Direct collaboration	Fast coordination, still versioned.
Internal different team	API-as-product	Compatibility, docs, support channel.
Platform consumer	Self-service	Strong docs and automation.
External partner	Formal contract	Versioning, security, SLA, approval.
Regulatory/reporting consumer	Evidence contract	Auditability, lineage, retention.

Not all consumers require the same process.

But every consumer requires an explicit relationship.

24. Mermaid: Service Ownership Lifecycle Touchpoints

Ownership is not assigned once.

It must be maintained.

25. Ownership Review Cadence

Every production service should be periodically reviewed.

Recommended cadence:

Service Tier	Ownership Review
Tier 0 / critical platform	Monthly or after major incident
Tier 1 / critical business	Quarterly
Tier 2 / important internal	Every 6 months
Tier 3 / low-criticality	Annually
Deprecated service	Monthly until retired

Review should answer:

Is the owner still correct?
Is the service still needed?
Is the catalog current?
Are runbooks current?
Are SLOs meaningful?
Are alerts actionable?
Are dependencies stale?
Are consumers known?
Is cost justified?
Is the service still aligned with business capability?
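The cadence table can be encoded so that overdue reviews are computed rather than remembered. The Java 17+ sketch below uses fixed intervals per tier, assumes "Monthly" for Tier 0 and for deprecated services, and does not model the "after major incident" trigger. The names `ReviewCadence` and `Tier` are hypothetical.

```java
import java.time.LocalDate;
import java.time.Period;

final class ReviewCadence {

    enum Tier {
        TIER_0(Period.ofMonths(1)),
        TIER_1(Period.ofMonths(3)),
        TIER_2(Period.ofMonths(6)),
        TIER_3(Period.ofYears(1));

        final Period interval;

        Tier(Period interval) {
            this.interval = interval;
        }
    }

    /** Deprecated services are reviewed monthly regardless of tier. */
    static LocalDate nextReviewDue(Tier tier, boolean deprecated, LocalDate lastReviewed) {
        Period interval = deprecated ? Period.ofMonths(1) : tier.interval;
        return lastReviewed.plus(interval);
    }

    static boolean isOverdue(Tier tier, boolean deprecated,
                             LocalDate lastReviewed, LocalDate today) {
        return today.isAfter(nextReviewDue(tier, deprecated, lastReviewed));
    }
}
```

For a tier-1 service last reviewed on 2026-07-05 this yields 2026-10-05, which matches the `next_review_due` value in the ownership contract example in section 11.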

26. Java-Specific Ownership Implications

In Java systems, ownership often decays through shared libraries and shared frameworks.

Watch these carefully.

26.1 Shared domain library smell

Bad:

common-domain.jar
  Case.java
  Evidence.java
  Decision.java
  EnforcementAction.java

This creates shared domain ownership across services.

Better:

each service owns its own domain model,
shared library contains only stable technical utilities,
integration uses explicit contracts,
shared kernel is rare and governance-heavy.
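In code, the "better" option usually looks like a published contract type plus a local model in each consumer. The Java 17+ sketch below illustrates that shape. `CaseEscalated` is an event name from the ownership contract in section 11, but the fields of `CaseEscalatedEvent`, the consumer-side `EnforcementCandidate`, and the `"URGENT"` level are hypothetical.

```java
import java.time.Instant;

/** Published integration contract of case-lifecycle-service (hypothetical fields). */
record CaseEscalatedEvent(String caseId, String escalationLevel, Instant occurredAt) {}

/**
 * Internal model of a consuming service. It keeps only what its own rules
 * need and does not share a class with the producer.
 */
record EnforcementCandidate(String caseId, boolean urgent) {

    /** Translation step: map the contract into the consumer's own language. */
    static EnforcementCandidate from(CaseEscalatedEvent event) {
        return new EnforcementCandidate(
                event.caseId(), "URGENT".equals(event.escalationLevel()));
    }
}
```

The producer can now rename or restructure its internal `Case` class freely. Only a change to the event contract requires talking to consumers.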

26.2 Platform starter ownership

Spring Boot starters or internal Java libraries can be useful.

But define ownership clearly.

Example:

platform-observability-starter
platform-security-starter
platform-database-starter
platform-http-client-starter

These are owned by the platform team.

Business services consume them.

But business teams still own:

span naming,
business metrics,
authorization decisions,
transaction boundaries,
API semantics.

A platform starter should not hide business decisions.

26.3 Internal framework risk

A custom internal framework can reduce boilerplate.

It can also create organizational lock-in.

Before building one, ask:

Does it reduce cognitive load?
Does it preserve service autonomy?
Can teams upgrade incrementally?
Does it support observability and debugging?
Does it expose escape hatches?
Does platform own support and compatibility?
Does it create hidden coupling between services?

Internal frameworks are products.

Treat them that way.

27. Ownership Metadata Exposed at Runtime

For internal services, consider exposing non-sensitive ownership metadata through actuator/info-style endpoints.

Example:

{
  "service": "case-lifecycle-service",
  "owner": "case-lifecycle-team",
  "tier": "tier-1",
  "lifecycle": "production",
  "version": "2026.07.05.1830",
  "gitCommit": "9f3a21c",
  "runbook": "https://internal.example.com/runbooks/case-lifecycle",
  "dashboard": "https://internal.example.com/dashboards/case-lifecycle"
}

This helps during incidents.

Do not expose sensitive details publicly.

Keep external attack surface minimal.

28. Ownership Smells

28.1 “Ask Platform” smell

Every question routes to the platform team.

This usually means service ownership is weak.

28.2 “Only one person knows” smell

Critical knowledge sits in one engineer’s head.

This is a bus-factor and incident-response risk.

28.3 “Generic shared service” smell

A common service collects unrelated business rules.

It usually becomes a change bottleneck.

28.4 “No consumer list” smell

The team cannot identify consumers.

Breaking changes become dangerous.

28.5 “No data owner” smell

Nobody owns data quality or correction.

Reporting, audit, and reconciliation degrade.

28.6 “Operations by another team” smell

One team builds, another team runs, nobody learns.

The feedback loop is broken.

28.7 “Service exists because repo exists” smell

The service no longer maps to an active capability.

It may be a retirement candidate.

29. Designing Ownership for a Regulatory Case Platform

Suppose we are designing a regulatory enforcement platform.

Possible services:

case-intake-service,
case-lifecycle-service,
evidence-service,
decision-policy-service,
enforcement-action-service,
audit-evidence-service,
notification-service,
reporting-read-model-service.

Bad ownership model:

All services owned by Backend Team.
Platform owns deployment.
Database team owns schema.
Ops team owns alerts.
Security owns authorization.

This looks organized but creates disconnected responsibility.

Better ownership model:

Service	Owning Team	Business Authority	Data Authority
case-intake-service	Intake Team	Intake eligibility and submission	intake submission records
case-lifecycle-service	Lifecycle Team	case state transitions and SLA	lifecycle state
evidence-service	Evidence Team	evidence metadata and retention	evidence metadata
decision-policy-service	Policy Team	decision rules and explainability	policy decision result
enforcement-action-service	Enforcement Team	action issuance and tracking	enforcement action state
audit-evidence-service	Compliance Platform Team	audit event capture and retention	audit evidence store
notification-service	Communication Platform Team	delivery mechanics	delivery state
reporting-read-model-service	Reporting Team	reporting views	reporting projection

Note:

audit-evidence-service may be platform-like, but its product is compliance evidence.

Its ownership must still be explicit.

30. Team API

Teams also have APIs.

A team API defines how other teams interact with the team.

For a service-owning team, define:

support channel,
consumer onboarding process,
API change request flow,
incident escalation path,
review SLA,
documentation location,
office hours,
deprecation notice period,
emergency contact,
ownership escalation.

This reduces random interruptions.

It also makes collaboration predictable.

Example:

# Case Lifecycle Team API

## Owns
- case-lifecycle-service
- case lifecycle state model
- case transition rules
- lifecycle audit events

## Support
- Slack: #team-case-lifecycle-support
- Office hours: Tuesday/Thursday 10:00-11:00
- Incident escalation: PagerDuty service `case-lifecycle`

## API Change Requests
Open an issue using template `api-change-request`.
Include consumer use case, expected traffic, error handling expectation, and required rollout date.

## Deprecation Policy
Minimum 90 days for internal consumers, except in a security or regulatory emergency.

## Emergency Changes
For Sev-1 production impact, contact incident commander and team on-call.

31. Ownership and Architecture Review

Architecture review should not only inspect diagrams.

It should inspect ownership.

Ask:

Who owns the service?
Who owns the business invariant?
Who owns the data authority?
Who owns the API compatibility policy?
Who owns incident response?
Who owns SLO definition?
Who owns deprecation?
Who owns cost?
Who owns security posture?
Who owns runbooks?
Who owns consumer communication?
Who has authority to make emergency decisions?

If these questions are not answered, the design is incomplete.

32. Ownership Decision Matrix

Use this matrix when deciding the owner of a service.

Question	If Yes	Ownership Implication
Does one team own most business rules?	Yes	That team is likely owner.
Does service enforce cross-domain policy?	Yes	Consider policy team or platform-product team.
Is service generic runtime capability?	Yes	Platform team may own.
Does service require deep specialist knowledge?	Yes	Complicated-subsystem team may own.
Does service change with one value stream?	Yes	Stream-aligned team should own.
Does service only aggregate views?	Yes	BFF/composition owner depends on user journey.
Does service store authoritative business state?	Yes	Owner must own data quality and correction.
Does service require 24/7 operational response?	Yes	Owner must support escalation.
Does service have many unrelated consumers?	Yes	Treat API as product; maybe platform/service product team.
Does no team understand it?	Yes	It is an ownership incident.

33. Anti-Pattern: Service per Developer

A common overreaction:

“Each developer owns a microservice.”

This is usually unhealthy.

Problems:

no team redundancy,
poor review quality,
inconsistent operations,
single-person failure domain,
fragmented domain understanding,
high coordination cost,
weak platform discipline.

Microservices should map to teams or stable ownership groups, not individual preferences.

Individual stewardship can exist.

Team accountability must remain.

34. Anti-Pattern: Team per Technical Layer

Layer-based teams create distributed monoliths.

Example:

API Team,
Service Team,
Database Team,
UI Team,
DevOps Team,
QA Team.

Every feature crosses teams.

This makes independent deployment and ownership difficult.

Better:

team owns a vertical capability,
platform provides self-service infrastructure,
specialist teams enable or own truly specialized subsystems.

Microservices are strongest when teams own vertical slices of business behavior.

35. Architecture Diagram with Ownership Overlay

When reviewing architecture, annotate ownership.

Then ask:

Are dependencies aligned with team collaboration patterns?
Which dependency is most operationally risky?
Which team owns the user journey?
Which team is paged for partial failure?
Which dependency crosses a high-friction organizational boundary?

A dependency between services is also a dependency between teams.

36. Measuring Ownership Health

Track ownership health as part of platform intelligence.

Possible metrics:

Metric	Meaning
Services without owner	Direct ownership risk.
Services without runbook	Operational readiness gap.
Services without SLO	Reliability ambiguity.
Services without recent review	Governance drift.
Deprecated services still receiving traffic	Retirement failure.
Services with unknown consumers	Compatibility risk.
Services with stale dependencies	Security/maintenance risk.
Alerts routed to generic channel	Incident ownership gap.
Services with one committer in 12 months	Bus-factor risk.
Services outside golden path	Platform fragmentation.

These metrics should not be used to punish teams.

They should reveal system risk.
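Several of these metrics are plain counts over catalog data. The Java 17+ sketch below computes four of them from a simplified projection of the catalog. The record `ServiceEntry` and its fields are hypothetical; in practice the data would come from the service catalog and traffic telemetry.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class OwnershipHealth {

    /** Hypothetical catalog projection; a null or blank string means "not recorded". */
    record ServiceEntry(String name, String owner, String runbook, String slo,
                        boolean deprecated, long requestsLast30Days) {}

    private static boolean blank(String value) {
        return value == null || value.isBlank();
    }

    /** Counts how many services match each ownership-risk condition. */
    static Map<String, Long> report(List<ServiceEntry> services) {
        Map<String, Predicate<ServiceEntry>> checks = new LinkedHashMap<>();
        checks.put("Services without owner", s -> blank(s.owner()));
        checks.put("Services without runbook", s -> blank(s.runbook()));
        checks.put("Services without SLO", s -> blank(s.slo()));
        checks.put("Deprecated services still receiving traffic",
                s -> s.deprecated() && s.requestsLast30Days() > 0);

        Map<String, Long> counts = new LinkedHashMap<>();
        checks.forEach((metric, check) ->
                counts.put(metric, services.stream().filter(check).count()));
        return counts;
    }
}
```

The report is aggregated per metric rather than per team on purpose, in line with the point that these numbers describe system risk and are not a scoreboard.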

37. Ownership Fitness Functions

Automate basic ownership checks.

Examples:

# Fails CI if service has no catalog file
test -f catalog-info.yaml

# Fails CI if runbook is missing
test -f docs/runbooks/production.md

# Fails CI if owner field is missing or null
# (-e makes yq exit non-zero when the result is null or false)
yq -e '.spec.owner' catalog-info.yaml > /dev/null

# Fails CI if lifecycle is not one of the allowed values
# (-x matches the whole line, so "non-production" does not pass)
yq '.spec.lifecycle' catalog-info.yaml | grep -Eqx 'experimental|production|deprecated|retired'

More advanced checks:

API owner must match catalog owner.
PagerDuty service must exist for tier-1 service.
Dashboard link must resolve.
Runbook must be modified within review window.
Deprecated APIs must publish deprecation date.
Service must have dependency ownership map.

Governance should be automated where possible.

Manual review should focus on judgment, not checklist archaeology.
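Some of the more advanced checks can be expressed as a small rule function that a CI job or scheduled audit calls. The Java 17+ sketch below covers three of them. It assumes the facts have already been collected from the catalog, the paging tool, and Git history; the record `ServiceFacts` and the way those facts are gathered are hypothetical.

```java
import java.time.LocalDate;
import java.time.Period;
import java.util.ArrayList;
import java.util.List;

final class OwnershipFitness {

    /** Hypothetical facts collected from the catalog, the paging tool, and Git history. */
    record ServiceFacts(String componentOwner, String apiOwner, String tier,
                        String pagerDutyServiceId, LocalDate runbookLastModified) {}

    /** Returns a human-readable message for every ownership rule that is violated. */
    static List<String> violations(ServiceFacts facts, LocalDate today, Period reviewWindow) {
        List<String> found = new ArrayList<>();

        if (!facts.componentOwner().equals(facts.apiOwner())) {
            found.add("API owner does not match catalog owner");
        }

        boolean hasPager = facts.pagerDutyServiceId() != null
                && !facts.pagerDutyServiceId().isBlank();
        if ("tier-1".equals(facts.tier()) && !hasPager) {
            found.add("tier-1 service has no PagerDuty service");
        }

        if (facts.runbookLastModified().isBefore(today.minus(reviewWindow))) {
            found.add("runbook not modified within review window");
        }
        return found;
    }
}
```

The runbook date is passed in rather than read from the file system because file modification times in a fresh CI checkout generally reflect the checkout, not the last commit that touched the file.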

38. Ownership Review Checklist

Use this checklist for every production service.

Identity

Service has one accountable owning team.
Service purpose is clear.
Business capability is identified.
Lifecycle state is defined.
Service tier is defined.

Runtime

Deployment owner is clear.
Rollback/roll-forward authority is clear.
Runtime dashboard exists.
Health checks are documented.
Capacity owner is clear.

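API and consumers

API owner matches catalog owner.
Known consumers are listed.
Deprecated APIs publish a deprecation date.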

Data

Authoritative data is listed.
Allowed writers are defined.
Data correction process exists.
Retention owner is defined.
Privacy-sensitive fields are classified.

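Operations

Runbook exists.
SLO is defined.
Alerts route to the owning team, not a generic channel.
Pager/escalation path is defined.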

Security

Threat model exists or risk is accepted.
Secrets owner is clear.
Authorization assumptions are documented.
Access review cadence exists.

Lifecycle

Review date is current.
Retirement path is known if service becomes obsolete.
Cost ownership exists.

39. Practical Implementation Plan

If your organization has weak service ownership, do not start by reorganizing everyone.

Start with visibility.

Step 1: Inventory services

Find all deployed services.

Sources:

Kubernetes deployments,
service mesh registry,
API gateway routes,
Git repositories,
CI/CD pipelines,
DNS entries,
observability metrics,
cloud resources.
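Once each source has been exported to a list of service names, the inventory reduces to a set comparison against the catalog. The sketch below assumes names have already been normalized across sources, which is often the hard part in practice; the class name and source labels are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class ServiceInventory {

    /**
     * Returns, for every service name that is absent from the catalog,
     * the set of sources in which it was observed.
     *
     * @param namesBySource source label (e.g. "kubernetes") to service names seen there
     * @param catalogued    service names already present in the catalog
     */
    static Map<String, Set<String>> uncatalogued(Map<String, Set<String>> namesBySource,
                                                 Set<String> catalogued) {
        Map<String, Set<String>> result = new TreeMap<>();
        namesBySource.forEach((source, names) -> {
            for (String name : names) {
                if (!catalogued.contains(name)) {
                    result.computeIfAbsent(name, k -> new TreeSet<>()).add(source);
                }
            }
        });
        return result;
    }
}
```

A service that shows up in several sources but not in the catalog is usually the first candidate for Step 2.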

Step 2: Assign current owner

Even if imperfect, assign one temporary accountable owner.

An unknown owner is worse than an imperfect one.

Step 3: Add minimum catalog metadata

Required fields:

service name,
owner,
system,
lifecycle,
tier,
repository,
runbook,
dashboard,
pager/escalation,
APIs/events,
dependencies.
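The required fields above can be enforced with a small validator. This is a minimal sketch that assumes the catalog entry has been flattened into a string map; the key names are hypothetical and would need to match whatever catalog format is in use. It also assumes a service with no dependencies records an explicit value such as "none" rather than leaving the field blank.

```java
import java.util.List;
import java.util.Map;

final class CatalogMetadata {

    /** Hypothetical keys for the minimum metadata listed in Step 3. */
    static final List<String> REQUIRED = List.of(
            "name", "owner", "system", "lifecycle", "tier", "repository",
            "runbook", "dashboard", "escalation", "apis", "dependencies");

    /** Returns the required keys that are absent or blank, in declaration order. */
    static List<String> missingFields(Map<String, String> metadata) {
        return REQUIRED.stream()
                .filter(key -> {
                    String value = metadata.get(key);
                    return value == null || value.isBlank();
                })
                .toList();
    }
}
```

Reporting all missing fields at once, rather than failing on the first, tends to make the initial clean-up less tedious for the owning team.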

Step 4: Identify orphan and high-risk services

Prioritize:

tier-1 services without owner,
services with production traffic and no runbook,
deprecated services with traffic,
services with stale dependencies,
services with unknown consumers.
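The priorities above can be turned into a simple ranking. The weights in this sketch are arbitrary assumptions chosen only to preserve the order of the list; the `Candidate` record and its fields are hypothetical as well.

```java
import java.util.Comparator;
import java.util.List;

final class OwnershipRisk {

    /** Hypothetical per-service facts taken from the catalog and telemetry. */
    record Candidate(String name, int tier, boolean hasOwner, boolean hasRunbook,
                     boolean hasTraffic, boolean deprecated,
                     boolean staleDependencies, boolean consumersKnown) {}

    /** Illustrative weights only: a higher score means the service is fixed first. */
    static int score(Candidate c) {
        int score = 0;
        if (c.tier() == 1 && !c.hasOwner()) score += 5;     // tier-1 without owner
        if (c.hasTraffic() && !c.hasRunbook()) score += 4;  // traffic, no runbook
        if (c.deprecated() && c.hasTraffic()) score += 3;   // deprecated with traffic
        if (c.staleDependencies()) score += 2;              // stale dependencies
        if (!c.consumersKnown()) score += 2;                // unknown consumers
        return score;
    }

    /** Returns services with at least one risk, highest score first, ties broken by name. */
    static List<Candidate> prioritize(List<Candidate> all) {
        Comparator<Candidate> byScore = Comparator.comparingInt(OwnershipRisk::score);
        return all.stream()
                .filter(c -> score(c) > 0)
                .sorted(byScore.reversed().thenComparing(Candidate::name))
                .toList();
    }
}
```

The ranking is a starting point for conversation with teams, not a substitute for it.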

Step 5: Fix ownership topology gradually

Options:

assign to correct capability team,
split service,
merge service,
retire service,
move platform capability to platform team,
create enabling engagement.

Do not solve ownership with a spreadsheet only.

Ownership must change how work flows.

40. The Top 1% Engineer's View

A senior engineer sees microservices as code.

A staff-level engineer sees them as runtime systems.

A top-tier architect sees them as socio-technical control points.

They ask:

Who owns this decision?
Who can change this safely?
Who gets paged?
Who owns the invariant?
Who owns the data?
Who can retire this?
Who pays the complexity cost?
Which team interaction does this dependency create?

The diagram is not complete until the ownership model is visible.

41. Exercises

Exercise 1: Ownership inventory

Pick one production service.

Write:

owning team,
service purpose,
authoritative data,
APIs/events,
known consumers,
SLO,
runbook,
alert route,
lifecycle state,
cost owner.

If any field is unknown, mark it as architectural risk.

Exercise 2: Ownership mismatch

Find a service where code owner, data owner, and incident owner differ.

Draw the current model.

Then propose a better ownership model.

Exercise 3: Platform boundary

Take one responsibility currently handled by a platform team.

Classify it:

platform capability,
business service behavior,
shared governance,
temporary enabling work.

Then define who should own it long-term.

Exercise 4: Consumer map

For one API, list all consumers.

For each consumer:

owning team,
traffic level,
criticality,
contract version,
migration contact,
error-handling expectation.

If consumers are unknown, your API compatibility risk is high.

42. Key Takeaways

Microservices are ownership architecture, not only deployment units.
A service without one accountable team is operational debt.
Ownership includes code, data, API, runtime, security, reliability, cost, and lifecycle.
Platform teams should reduce cognitive load, not absorb business accountability.
Team topology must match business capability and change topology.
A service catalog is ownership infrastructure.
On-call and alert ownership should follow mitigation authority.
Shared ownership must be explicit or it becomes no ownership.
Architecture review should inspect ownership, not only component diagrams.
Governance should be automated for objective checks and left to humans for judgment calls.

43. References

Team Topologies — Key Concepts: https://teamtopologies.com/key-concepts
Backstage Software Catalog: https://backstage.io/docs/features/software-catalog/
AWS Prescriptive Guidance — Service per team pattern: https://docs.aws.amazon.com/prescriptive-guidance/latest/modernization-decomposing-monoliths/service-per-team.html
Google SRE Book — Production Readiness Review / SRE Engagement Model: https://sre.google/sre-book/evolving-sre-engagement-model/
Martin Fowler — Microservices: https://martinfowler.com/articles/microservices.html
