
Learn AI Coding Agent Part 006: Use Case Selection and Risk Classification


title: Learn AI Coding Agent From Scratch - Part 006
description: Choosing the first use cases for an AI coding agent and building a risk classification that decides between the autonomous lane, supervised lane, draft-only lane, and blocked lane.
series: learn-ai-coding-agent
seriesTitle: Learn AI Coding Agent From Scratch
order: 6
partTitle: Use Case Selection and Risk Classification
tags:
  - ai-coding-agent
  - risk-classification
  - use-case-selection
  - software-maintenance
  - automation-strategy
  - governance
date: 2026-07-03

Part 006 — Use Case Selection and Risk Classification

The previous part established the domain: we are building a controlled code-change automation system. Now we have to choose the first kind of work worth automating.

This is an architectural decision, not just a product choice. The first use case determines:

the shape of the task contract;
the tools that must be available;
the verifiers required;
the risks that must be controlled;
the initial evaluation data;
how developers judge the system;
whether the platform earns trust or is immediately written off as a PR spammer.

Goal of this part:

Build a framework for use case selection and risk classification so that, in the early stages, the agent is not handed work that is too vague, too dangerous, or too hard to verify.

Relevant factual references:

Spotify Engineering describes Honk being used for large-scale software maintenance and PR workflows, especially work that is repetitive and verifiable.
https://engineering.atspotify.com/2025/11/spotifys-background-coding-agent-part-1
The Claude Code documentation states that the agent can read a codebase, edit files, run commands, and integrate with development tools. This shows the baseline capability of a modern agent, but capability is not the same as risk approval.
https://code.claude.com/docs/en/overview
Claude Code permission modes provide a non-interactive, restricted execution approach through pre-approved tools/permissions, which is relevant for CI or background mode.
https://code.claude.com/docs/en/permission-modes
Claude Code sandboxing emphasizes filesystem and network isolation as security controls for agentic execution.
https://www.anthropic.com/engineering/claude-code-sandboxing
The Model Context Protocol separates tools, resources, and prompts, which is useful for building use-case-specific verifiers and context servers.
https://modelcontextprotocol.io/specification/2025-06-18

1. Core principle: pick use cases that build trust

The first use case does not have to be the most impressive one. It should be the one most likely to produce:

small PRs;
strong verifiers;
clear acceptance criteria;
low risk;
real value;
easy review;
failures that are easy to understand.

Many teams pick the wrong first target. They jump straight to “the agent can pick up any issue and implement a feature”. That makes an attractive demo, but it is a poor platform foundation.

For a Honk-like background agent, a good first use case usually takes the form of maintenance automation, not greenfield feature development.

Maintenance automation has several advantages:

the objective is usually narrower;
the change pattern repeats;
there are many old/new examples;
verifiers are clearer;
PRs are easy to review;
it can run across many repos;
value can be measured by the migration time saved.

Principle:

Start where the change pattern is repetitive, bounded, and externally verifiable.
Avoid starting where success depends mostly on subjective product judgment.

2. Use case lane: autonomous, supervised, draft-only, blocked

We will not give every task the same mode.

We need lanes:

Lane	Meaning	Output
`autonomous_pr`	The agent may create a PR if the gates pass.	PR ready for review.
`supervised_pr`	The agent may run, but needs approval before the PR is created or before certain tools are used.	Draft diff, or PR after approval.
`draft_only`	The agent only produces a patch proposal and does not create a PR automatically.	Diff artifact + explanation.
`analysis_only`	The agent only analyzes the repo and produces a plan.	Report/plan.
`blocked`	The task must not be run by the agent.	Rejection with reason.

[Diagram: initial lane decision flow]

A lane is not a permanent status. A use case can move up a lane once the system has data:

analysis_only -> draft_only -> supervised_pr -> autonomous_pr

But do not jump straight to autonomous for high-risk tasks.

3. Use case selection dimensions

We use eight dimensions.

Dimension	Key question
Clarity	Can the objective be written down clearly?
Boundedness	Can the path/symbol/change shape be constrained?
Verifiability	Can the result be proven with build/test/static analysis?
Repeatability	Does the pattern appear in many repos/files?
Value	Does automation save real time?
Blast radius	If it goes wrong, how far does the impact spread?
Reversibility	Is rollback easy?
Reviewability	Can a reviewer read the PR quickly?

Initial scoring: 1 to 5.

Score	Meaning
1	poor fit for automation
2	weak, needs strong supervision
3	can be tried as draft/supervised
4	good fit for automation
5	very well suited to automation

We compute two scores:

automation_fit = clarity + boundedness + verifiability + repeatability + value + reversibility + reviewability
risk_pressure = blast_radius_inverse_adjusted

To make this concrete, we use an explicit scoring table.

4. Scoring table

4.1 Clarity

Score	Criteria
1	Subjective objective: “improve design”, “make it better”.
2	An objective exists but is ambiguous: “modernize auth”.
3	The objective is reasonably clear but acceptance details are missing.
4	The objective is clear and has before/after examples.
5	The objective is clear and has a migration guide, examples, and forbidden changes.

4.2 Boundedness

Score	Criteria
1	Can touch the whole repo with no limit.
2	A module boundary exists but paths/symbols are not yet clear.
3	Allowed paths can be defined.
4	Allowed/forbidden paths and the expected diff shape are clear.
5	Can be constrained with symbol-level or AST-level rules.

4.3 Verifiability

Score	Criteria
1	No relevant test/build oracle exists.
2	Only lint/syntax checks.
3	Compile/build can be run.
4	Relevant unit tests are available.
5	Unit + integration/golden/static policy checks are available.

4.4 Repeatability

Score	Criteria
1	One-off, unique.
2	Similar across a few files.
3	Appears in a few repos.
4	Pattern repeats across many repos.
5	Fleet-wide maintenance campaign.

4.5 Value

Score	Criteria
1	Nice-to-have.
2	Saves a little time.
3	Clears a maintenance backlog.
4	Avoids a deadline/platform cutoff.
5	High-value security/compliance/platform migration.

4.6 Blast radius

For blast radius, a higher score means safer.

Score	Criteria
1	Critical: auth/crypto/destructive data operations/external public API.
2	High: production behavior across modules/services.
3	Medium: limited production path.
4	Low: internal mechanical change.
5	Very low: test/config/docs/non-runtime, or safely generated output.

4.7 Reversibility

Score	Criteria
1	Hard to roll back; data can change irreversibly.
2	Rollback needs a coordinated deployment.
3	Normal revert, but with runtime risk.
4	A normal git revert is enough.
5	No production effect before merge/release.

4.8 Reviewability

Score	Criteria
1	Large diff, many topics, hard to review.
2	Many files, and the reviewer must understand a large context.
3	Medium PR, one topic.
4	Small PR, expected shape is clear.
5	Mechanical diff, quick to check.

5. Initial decision rule

We can use a simple rule:

total_score = clarity + boundedness + verifiability + repeatability + value + blast_radius + reversibility + reviewability

Default lane:

Total score	Default lane
34–40	strong candidate for `autonomous_pr`
28–33	`supervised_pr`
21–27	`draft_only`
14–20	`analysis_only`
<=13	`blocked` or reject

But some hard blockers override the score:

hard_blockers:
  - destructive_database_migration
  - secret_or_credential_handling
  - production_authz_policy_change
  - cryptographic_algorithm_change
  - public_external_api_breaking_change_without_migration_plan
  - deletes_or_disables_tests_to_pass_verification
  - modifies_ci_to_skip_required_checks
  - requires_access_to_production_data
  - modifies_license_or_legal_files_without_approval

If a hard blocker is present, the lane automatically drops to blocked, or at most draft_only with explicit human approval.
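The decision rule above can be sketched in TypeScript. This is a minimal sketch under stated assumptions, not the platform's implementation: the names `DimensionScores`, `totalScore`, and `defaultLane` are hypothetical, and the sketch takes the stricter of the two options by mapping any hard blocker straight to `blocked`.

```typescript
// Hypothetical sketch of the section 5 decision rule.
type Lane =
  | "autonomous_pr"
  | "supervised_pr"
  | "draft_only"
  | "analysis_only"
  | "blocked";

// One score per dimension, each from 1 to 5. For blastRadius, higher means safer.
interface DimensionScores {
  clarity: number;
  boundedness: number;
  verifiability: number;
  repeatability: number;
  value: number;
  blastRadius: number;
  reversibility: number;
  reviewability: number;
}

// Sums the eight dimensions after validating that each is an integer in [1, 5].
function totalScore(s: DimensionScores): number {
  const values = [
    s.clarity, s.boundedness, s.verifiability, s.repeatability,
    s.value, s.blastRadius, s.reversibility, s.reviewability,
  ];
  for (const v of values) {
    if (!Number.isInteger(v) || v < 1 || v > 5) {
      throw new RangeError(`each dimension must be an integer from 1 to 5, got ${v}`);
    }
  }
  return values.reduce((sum, v) => sum + v, 0);
}

// Maps the total to the default lane using the thresholds from the table.
function defaultLane(s: DimensionScores, hardBlockers: string[] = []): Lane {
  // Assumption: a hard blocker always wins over the score.
  if (hardBlockers.length > 0) return "blocked";
  const total = totalScore(s);
  if (total >= 34) return "autonomous_pr";
  if (total >= 28) return "supervised_pr";
  if (total >= 21) return "draft_only";
  if (total >= 14) return "analysis_only";
  return "blocked";
}
```

For example, a candidate scored 4, 4, 4, 3, 3, 3, 4, 4 totals 29 and lands in `supervised_pr`; the same candidate with any entry in `hardBlockers` is `blocked` regardless of its score.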

6. Candidate use cases

We will cover five main families of use cases:

dependency upgrade;
API migration;
config/schema migration;
mechanical refactor;
test fix/test generation.

Each has a different shape, risk profile, verifier, and lane.

7. Use Case A — Dependency upgrade

7.1 Problem shape

A dependency upgrade is a classic target for a coding agent:

update the version in pom.xml, build.gradle, package.json, go.mod;
the build fails because an API changed;
the agent fixes the call sites;
tests are run;
a PR is created.

Example:

<!-- before -->
<dependency>
  <groupId>com.company.platform</groupId>
  <artifactId>auth-client</artifactId>
  <version>2.8.1</version>
</dependency>

<!-- after -->
<dependency>
  <groupId>com.company.platform</groupId>
  <artifactId>auth-client</artifactId>
  <version>3.1.0</version>
</dependency>

7.2 Why it fits

A dependency upgrade fits when:

a migration guide exists;
breaking changes are limited;
compile errors give strong feedback;
the test suite is reasonably good;
the target version is clear;
rollback is easy.

7.3 Risks

Risk	Example
Transitive dependency conflict	The new version pulls in a conflicting dependency.
Runtime behavior change	Compile passes but behavior changes.
Security regression	The new dependency has a vulnerability.
Test adaptation cheating	The agent changes tests so they no longer exercise the old behavior.
Lockfile noise	Large lockfile changes are hard to review.

7.4 Verifier

Minimum:

verification:
  - mvn -q test
  - mvn -q dependency:tree
  - dependency vulnerability scan
  - forbidden diff check for test deletion

Stronger:

verification:
  - mvn -q -DskipITs=false verify
  - contract tests
  - smoke test with containerized dependencies
  - dependency convergence check

7.5 Recommended lane

Condition	Lane
Patch version, no API changes	`autonomous_pr`
Minor version, compile fixes localized	`supervised_pr`
Major version with migration guide	`supervised_pr` or `draft_only`
Security/auth/crypto dependency	`draft_only` with human approval
Unknown breaking changes	`analysis_only` first

7.6 Example task contract

id: CR-dep-auth-client-3
type: dependency_upgrade
repository: payments-api
objective: Upgrade auth-client from 2.8.1 to 3.1.0
allowed_paths:
  - pom.xml
  - src/main/java/**
  - src/test/java/**
forbidden_paths:
  - openapi/**
  - src/main/resources/db/migration/**
expected_diff_shape:
  - dependency version update
  - compile-error-driven call site adaptation
  - unit test updates only when API construction changed
forbidden_diff_shape:
  - deleting tests
  - disabling Maven plugins
  - skipping test phases
  - changing public REST contract
verifier_commands:
  - mvn -q test
  - mvn -q -DskipITs=false verify
risk:
  max_lane: supervised_pr
review_focus:
  - token validation behavior
  - exception mapping

8. Use Case B — API migration

8.1 Problem shape

An API migration happens when an internal library/platform changes its interface.

Example:

// before
PriceResponse response = priceClient.calculate(userId, items);

// after
PriceResponse response = priceClient.calculate(
    PriceRequest.builder()
        .userId(userId)
        .items(items)
        .build()
);

8.2 Why it fits

An API migration fits when:

before/after is clear;
call site patterns can be found;
compile errors help;
the semantic mapping is explicit;
a migration guide is available.

8.3 Risks

Risk	Example
Semantic mismatch	Old parameters and new fields do not map one-to-one.
Wrong default value	The agent picks a default that does not match the business rules.
Changed error handling	A new exception is not mapped.
Performance change	The new API is more expensive when called in a loop.
Partial migration	Some call sites are left behind.

8.4 Verifier

verification:
  - compile
  - unit tests
  - grep forbidden old API usage
  - optional semantic test for mapping

Important rule:

No old API usage may remain unless explicitly allowed.

8.5 Recommended lane

Condition	Lane
Mechanical import/method rename	`autonomous_pr`
Constructor/request object adaptation	`supervised_pr`
Business semantic mapping	`draft_only`
Public API contract migration	`draft_only` or `blocked` without rollout plan

8.6 Pattern detector

Before the agent edits anything, we can scan the call sites:

rg "priceClient\.calculate\(" src/main/java src/test/java

Or an AST-level detector:

Find MethodInvocation where:
  receiver type = com.company.pricing.PriceClient
  method name = calculate
  argument count = 2

The more deterministic the detector, the safer the automation.
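A text-level detector in the spirit of the `rg` command above can be sketched in a few lines. This is an illustration only: `findCallSites` and `CallSite` are hypothetical names, and because it matches raw text it will also report hits inside comments or strings, which is exactly the gap an AST-level detector closes.

```typescript
// Hypothetical line-based call site detector (text match, not AST-aware).
interface CallSite {
  line: number; // 1-based line number
  text: string; // trimmed source line
}

// Returns every line of `source` that matches `pattern`.
function findCallSites(source: string, pattern: RegExp): CallSite[] {
  const hits: CallSite[] = [];
  const lines = source.split(/\r?\n/);
  lines.forEach((text, index) => {
    // String.prototype.search ignores the "g" flag and lastIndex,
    // so the same RegExp can be reused safely for every line.
    if (text.search(pattern) >= 0) {
      hits.push({ line: index + 1, text: text.trim() });
    }
  });
  return hits;
}

// Usage: list remaining call sites of the old two-argument API.
const sample = [
  "class CheckoutService {",
  "  Price quote() { return priceClient.calculate(userId, items).total(); }",
  "  void audit() { log.info(\"done\"); }",
  "}",
].join("\n");

for (const hit of findCallSites(sample, /priceClient\.calculate\(/)) {
  console.log(`${hit.line}: ${hit.text}`);
}
```

The same function can serve twice in a run: before editing, to size the task, and after editing, as an absence check where an empty result means no old usage remains.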

9. Use Case C — Config and schema migration

9.1 Problem shape

Config migration:

# before
tracing:
  enabled: true
  sampleRate: 0.1

# after
observability:
  tracing:
    enabled: true
    sampling:
      rate: 0.1

Schema migration can mean:

OpenAPI field rename;
JSON Schema version bump;
Avro schema evolution;
database migration;
generated client update.

9.2 Why it is attractive

Config/schema changes often show up fleet-wide. A single platform team may need hundreds of repos to adopt a new format.

9.3 Why it is dangerous

Config looks small but is runtime-critical.

Risks:

Risk	Example
Silent behavior change	A wrong key causes the default to be used.
Environment-specific issue	Dev passes, prod fails because of an env override.
Backward compatibility	The old config is still read by older services.
Schema compatibility	Consumers are not ready for the new field.
Generated code noise	Large diff from the generator.

9.4 Verifier

For config:

verification:
  - config parser validation
  - application context startup test
  - schema validation
  - forbidden unknown key check

For OpenAPI/JSON Schema/Avro:

verification:
  - schema validation
  - backward compatibility check
  - generated code deterministic check
  - contract tests

For database schemas:

verification:
  - migration applies cleanly on empty db
  - migration applies cleanly on previous schema snapshot
  - rollback strategy exists
  - destructive operation detection

Destructive database migrations should not be an early autonomous use case.

9.5 Recommended lane

Condition	Lane
Non-runtime config rename with validator	`autonomous_pr`
Runtime config with startup test	`supervised_pr`
OpenAPI additive change	`supervised_pr`
Avro/Protobuf compatible evolution	`supervised_pr`
DB additive migration	`draft_only` or `supervised_pr` with approval
DB destructive migration	`blocked` for autonomous

10. Use Case D — Mechanical refactor

10.1 Problem shape

A mechanical refactor is a change to code structure with no intent to change behavior.

Examples:

rename package;
replace deprecated annotation;
convert field injection to constructor injection;
replace utility method;
normalize logger declaration;
update import path;
replace test assertion library syntax.

10.2 Why it fits

A mechanical refactor fits because the expected diff shape is clear.

Example:

// before
@Inject
private PaymentService paymentService;

// after
private final PaymentService paymentService;

@Inject
public PaymentController(PaymentService paymentService) {
    this.paymentService = paymentService;
}

But not every mechanical refactor is low risk. Constructor injection can affect framework wiring.

10.3 Risks

Risk	Example
Framework behavior	Annotation placement changes the runtime effect.
Reflection	Renaming a symbol breaks a string-based lookup.
Serialization	A field/property name changes.
Generated code	The agent edits a generated file.
Over-refactor	The agent fixes unrelated style.

10.4 Verifier

verification:
  - compile
  - unit tests
  - framework startup test
  - no generated file edit
  - no public contract diff

For highly pattern-based mechanical refactors, a deterministic AST transform is often better than a pure agent.

Principle:

If a transformation can be expressed safely as AST rules, do not delegate the core transformation to an LLM.
Use the agent for discovery, repair, explanation, and edge cases.

11. Use Case E — Test fix and test generation

11.1 Problem shape

Test-related automation comes in several kinds:

fixing tests that fail because of an API migration;
adding a test for an uncovered bug;
fixing a flaky test;
writing characterization tests before a refactor;
updating snapshot/golden files.

11.2 Why it is hard

Tests can increase trust, but an agent can also abuse them.

Serious failure modes:

tests are deleted;
assertions are weakened;
mocks are made too permissive;
snapshots are updated without understanding the behavior;
a flaky test is “fixed” with a longer sleep;
the test is adjusted to match the bug instead of the behavior being fixed.

11.3 Recommended lane

Condition	Lane
Update test compile errors caused by an API signature change	`supervised_pr`
Add test for pure function with clear expected behavior	`supervised_pr`, or `autonomous_pr` once mature
Update snapshot with deterministic generator	`supervised_pr`
Fix flaky concurrency test	`draft_only`
Change test expectation for business rule	`draft_only` with owner approval
Delete/disable test	`blocked` unless explicit human approval

11.4 Test policy

test_policy:
  default_forbid:
    - deleting test files
    - disabling test classes
    - removing assertions without replacement
    - adding broad catch-ignore blocks
    - adding sleeps as primary flaky fix
    - changing expected business values without explanation
  allowed_when_justified:
    - adapting constructor setup to new API
    - updating imports
    - adding focused assertions
    - adding regression test for specified bug
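Several of the `default_forbid` rules above can be approximated mechanically from a diff. The sketch below is one hedged way to do that. Everything in it is an assumption made for illustration: the `FileChange` shape, the `src/test/` prefix used to recognize test files, and the regexes for JUnit's `@Disabled`/`@Ignore` annotations, `Thread.sleep`, and `assert...(` calls. It is a heuristic first filter, not a replacement for human review.

```typescript
// Hypothetical forbidden-diff check for the test policy. Heuristic, Java/JUnit oriented.
interface FileChange {
  path: string;
  status: "added" | "modified" | "deleted";
  addedLines: string[];
  removedLines: string[];
}

interface PolicyViolation {
  path: string;
  rule: string;
}

// Assumption: test sources live under src/test/ (Maven layout).
const isTestFile = (path: string): boolean => path.startsWith("src/test/");

const ASSERTION = /\bassert\w*\s*\(/;

function checkTestPolicy(changes: FileChange[]): PolicyViolation[] {
  const violations: PolicyViolation[] = [];
  for (const change of changes) {
    if (!isTestFile(change.path)) continue;

    if (change.status === "deleted") {
      violations.push({ path: change.path, rule: "deleting test files" });
      continue;
    }
    if (change.addedLines.some((l) => /@(Disabled|Ignore)\b/.test(l))) {
      violations.push({ path: change.path, rule: "disabling test classes" });
    }
    if (change.addedLines.some((l) => /Thread\.sleep\s*\(/.test(l))) {
      violations.push({ path: change.path, rule: "adding sleeps as primary flaky fix" });
    }
    // Net loss of assertion lines is treated as removal without replacement.
    const removed = change.removedLines.filter((l) => ASSERTION.test(l)).length;
    const added = change.addedLines.filter((l) => ASSERTION.test(l)).length;
    if (removed > added) {
      violations.push({ path: change.path, rule: "removing assertions without replacement" });
    }
  }
  return violations;
}
```

Counting assertion lines is deliberately crude: it catches an agent that drops checks to get green, but it cannot tell whether a replacement assertion is weaker, so a non-empty result should stop the run and an empty result should still go through review.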

12. Use case comparison matrix

Use case	Initial fit	Main risk	Main verifier	Initial lane
Deprecated annotation replacement	very high	reflection/framework nuance	compile + grep old usage	`autonomous_pr`
Dependency patch upgrade	high	transitive dependency	test + dependency scan	`autonomous_pr`
Dependency major upgrade	medium	breaking behavior	verify + review	`supervised_pr`
Internal API method rename	high	missed call sites	compile + grep	`autonomous_pr`/`supervised_pr`
Request object migration	medium	wrong field mapping	tests + judge	`supervised_pr`
Config key rename	medium	silent runtime default	config validation	`supervised_pr`
OpenAPI additive field	medium	consumer compatibility	schema compatibility	`supervised_pr`
DB additive migration	low-medium	deployment ordering	migration test	`draft_only`
DB destructive migration	low	data loss	not enough	`blocked`
Test generation for pure function	medium	weak assertions	mutation/coverage optional	`supervised_pr`
Flaky test fix	low	hiding real race	repeated test run	`draft_only`
Architecture redesign	low	subjective correctness	weak	`analysis_only`

13. First use case choice for this series

For this build-from-scratch series, we will start with the following combination:

Primary use case: Java internal API migration

Why?

it fits your background as a Java/backend engineer;
it is realistic enough for an enterprise codebase;
it has strong compile feedback;
it can be built in a sample repo;
it can demonstrate multi-file cascading changes;
it can be extended into a fleet migration;
it needs no production secrets or cloud access.

Its shape:

Migrate usage of deprecated `LegacyAuditClient.record(String actor, String action, String target)`
to `AuditClient.record(AuditEvent event)`.

Before:

legacyAuditClient.record(userId, "APPROVE_CASE", caseId);

After:

auditClient.record(
    AuditEvent.builder()
        .actor(userId)
        .action("APPROVE_CASE")
        .target(caseId)
        .source("case-management")
        .build()
);

Verifier:

mvn -q test
rg "LegacyAuditClient|legacyAuditClient\.record" src/main/java src/test/java

Risk:

medium if the audit path is production-critical;
low-medium for a sample repo;
supervised initially;
autonomous once the policy/verifier/judge are mature.

Secondary use case: dependency upgrade

We will use this later to demonstrate the build failure repair loop.

Tertiary use case: config migration

We will use this later for the schema/config verifier and fleet campaigns.

14. Risk classification model

We will produce classification output like this:

risk_classification:
  level: medium
  lane: supervised_pr
  reasons:
    - touches production audit path
    - compile verifier available
    - old API usage can be detected deterministically
    - no database or public API change expected
  required_gates:
    - allowed_path_check
    - forbidden_diff_check
    - compile_test
    - old_api_absence_check
    - llm_diff_judge
    - human_review
  disallowed_actions:
    - delete_tests
    - modify_ci
    - change_public_api
    - edit_database_migrations

The classifier does not have to be ML. To start with, rule-based is better.

Signals for a rule-based classifier:

signals:
  touches_auth: false
  touches_authz: false
  touches_crypto: false
  touches_db_migration: false
  touches_public_api: false
  touches_ci: false
  touches_tests: true
  expected_file_count: 8
  has_compile_verifier: true
  has_unit_tests: true
  has_integration_tests: false
  old_api_detector_available: true
  rollback: git_revert

15. Policy mapping from risk to gates

Risk level	Lane	Required gates
very low	`autonomous_pr`	path check, build/test, diff summary
low	`autonomous_pr`	path check, forbidden diff, build/test, simple judge
medium	`supervised_pr`	all low gates + risk explanation + human review focus
high	`draft_only`	analysis, patch proposal, no automatic PR
critical	`blocked`	explain rejection

Example gate mapping:

risk_gate_policy:
  low:
    - validate_task_contract
    - enforce_allowed_paths
    - enforce_forbidden_paths
    - run_verifier_commands
    - run_forbidden_diff_rules
    - create_pr_if_pass
  medium:
    - validate_task_contract
    - enforce_allowed_paths
    - enforce_forbidden_paths
    - run_verifier_commands
    - run_forbidden_diff_rules
    - run_llm_diff_judge
    - require_pr_body_risk_section
    - create_pr_with_supervised_label
  high:
    - validate_task_contract
    - run_analysis
    - optionally_generate_patch
    - do_not_create_pr_without_approval
  critical:
    - reject_or_manual_process

16. Initial dataset for use case evaluation

Before running the agent on real repos, we need an evaluation dataset.

For the primary API migration use case, create several sample repos/tasks:

Case	Description	Expected outcome
`simple-single-callsite`	One legacy API call site.	Patch succeeds.
`multiple-callsite`	Many call sites across several classes.	All migrated.
`test-callsite`	Tests also use the legacy API.	Tests updated without weakening assertions.
`ambiguous-field-mapping`	Parameters do not clearly map to the new fields.	Agent stops or asks for approval.
`forbidden-path`	Legacy usage in a generated file.	Generated file is not edited.
`compile-failure-repair`	The initial change fails to compile.	Agent repairs it.
`no-tests`	Compile passes but there are no relevant tests.	Lane is lowered, or a warning is raised.
`public-contract-risk`	Migration touches an API DTO.	Draft/supervised, not autonomous.

Planned folder structure:

evals/
  api-migration/
    simple-single-callsite/
      repo/
      task.yaml
      expected.patch
      rubric.yaml
    multiple-callsite/
      repo/
      task.yaml
      expected.patch
      rubric.yaml

Example rubric:

rubric:
  must:
    - no usage of LegacyAuditClient remains in src/main/java
    - mvn test passes
    - no test file deleted
    - no public API files changed
  should:
    - PR body mentions audit event field mapping
    - diff changes fewer than 10 files
  must_not:
    - modify pom.xml
    - disable tests
    - change database migration files

17. Implementation preview: risk classifier interface

We have not implemented the full platform yet, but its initial shape can be designed.

TypeScript-like model:

type Lane =
  | "autonomous_pr"
  | "supervised_pr"
  | "draft_only"
  | "analysis_only"
  | "blocked";

type RiskLevel = "very_low" | "low" | "medium" | "high" | "critical";

interface ChangeRequest {
  id: string;
  type: string;
  repository: string;
  objective: string;
  allowedPaths: string[];
  forbiddenPaths: string[];
  expectedDiffShape: string[];
  forbiddenDiffShape: string[];
  verifierCommands: string[];
  metadata: Record<string, unknown>;
}

interface RiskClassification {
  level: RiskLevel;
  lane: Lane;
  score: number;
  reasons: string[];
  hardBlockers: string[];
  requiredGates: string[];
  disallowedActions: string[];
}

Rule function:

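function classifyRisk(request: ChangeRequest): RiskClassification {
  // extractSignals, detectHardBlockers, scoreAutomationFit, mapScoreToLane,
  // explainScore, gatesFor, and disallowedActionsFor are helpers that are
  // not defined yet; they arrive with the control plane.
  const signals = extractSignals(request);
  const hardBlockers = detectHardBlockers(signals, request);

  // Hard blockers override any score: reject before the agent runs.
  if (hardBlockers.length > 0) {
    return {
      level: "critical",
      lane: "blocked",
      score: 0,
      reasons: ["Hard blocker detected"],
      hardBlockers,
      requiredGates: ["manual_review"],
      disallowedActions: ["agent_execution"]
    };
  }

  // No hard blocker: derive level and lane from the automation-fit score.
  const score = scoreAutomationFit(signals, request);
  const { level, lane } = mapScoreToLane(score, signals);

  return {
    level,
    lane,
    score,
    reasons: explainScore(score, signals),
    hardBlockers: [],
    requiredGates: gatesFor(level),
    disallowedActions: disallowedActionsFor(level)
  };
}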

We will implement details like these later, when building the control plane.

18. Stop conditions per use case

The agent must have clear stop conditions.

For API migration:

stop_conditions:
  - verifier failed more than 3 times
  - diff touches forbidden paths
  - files changed exceeds 20
  - old API remains but no progress between iterations
  - agent wants to change public contract
  - agent wants to delete or disable tests
  - build failure unrelated to migration cannot be isolated

For dependency upgrade:

stop_conditions:
  - dependency resolution cannot converge
  - major version requires unsupported runtime upgrade
  - vulnerability scan fails for target version
  - generated lockfile diff too large for review policy

For config migration:

stop_conditions:
  - config parser unavailable
  - environment-specific values required
  - old and new keys need dual-write/dual-read rollout but task lacks rollout plan

A stop condition is not a platform failure. A stop condition is a safety feature.

19. PR label strategy by lane

So that reviewers understand the risk immediately, PRs must be labeled.

labels:
  autonomous_pr:
    - ai-agent
    - automation
    - risk:low
  supervised_pr:
    - ai-agent
    - needs-owner-review
    - risk:medium
  draft_only:
    - ai-agent-proposal
    - do-not-merge
  high_risk:
    - risk:high
    - manual-approval-required

PR title pattern:

[agent][api-migration] Migrate LegacyAuditClient usage to AuditClient in payments-api

Commit message pattern:

Migrate LegacyAuditClient usage to AuditClient

Generated by ai-coding-agent run CR-2026-000123.
Verification:
- mvn -q test: passed
- legacy API grep: passed

Traceability must be visible from the PR without opening an internal dashboard.
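The labels, title pattern, and commit message pattern above can be generated from the run record so they stay consistent across PRs. The sketch below is one possible shape: `PrRequest`, `buildPrMetadata`, and `LANE_LABELS` are hypothetical names, and only the three lanes that produce a diff are modeled.

```typescript
// Hypothetical PR metadata builder following the label and message patterns above.
type PrLane = "autonomous_pr" | "supervised_pr" | "draft_only";

const LANE_LABELS: Record<PrLane, string[]> = {
  autonomous_pr: ["ai-agent", "automation", "risk:low"],
  supervised_pr: ["ai-agent", "needs-owner-review", "risk:medium"],
  draft_only: ["ai-agent-proposal", "do-not-merge"],
};

interface PrRequest {
  lane: PrLane;
  useCase: string; // e.g. "api-migration"
  summary: string; // e.g. "Migrate LegacyAuditClient usage to AuditClient"
  repository: string;
  runId: string;
  verification: { name: string; passed: boolean }[];
}

interface PrMetadata {
  title: string;
  labels: string[];
  commitMessage: string;
}

function buildPrMetadata(req: PrRequest): PrMetadata {
  const verificationLines = req.verification.map(
    (v) => `- ${v.name}: ${v.passed ? "passed" : "failed"}`,
  );
  return {
    title: `[agent][${req.useCase}] ${req.summary} in ${req.repository}`,
    labels: [...LANE_LABELS[req.lane]], // copy so callers cannot mutate the table
    commitMessage: [
      req.summary,
      "",
      `Generated by ai-coding-agent run ${req.runId}.`,
      "Verification:",
      ...verificationLines,
    ].join("\n"),
  };
}
```

Because the verification results are written into the commit message itself, a reviewer can see what was checked directly in the PR.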

20. Final recommendation for build order

We will build the use cases in this order:

API migration -> dependency upgrade -> config migration -> test generation -> fleet campaign

The reasons:

API migration gives us the agent loop, file edits, a grep detector, a compile verifier, and forbidden diff rules.
Dependency upgrade adds package/build complexity and the repair loop.
Config migration adds schema/config validation and runtime startup concerns.
Test generation adds policy to prevent test cheating.
Fleet campaign adds batching, targeting, rollout, backoff, and metrics.

This is progressive. Each use case adds one dimension to the system without burning trust too early.

21. Template use case card

Every use case in the platform must have a card.

use_case_card:
  id:
  name:
  owner_team:
  description:
  examples:
  non_examples:
  default_lane:
  allowed_repositories:
  allowed_paths:
  forbidden_paths:
  required_inputs:
  expected_diff_shape:
  forbidden_diff_shape:
  verifier_profile:
  risk_profile:
  stop_conditions:
  pr_template:
  success_metrics:
  rollout_policy:

Example:

use_case_card:
  id: java-api-migration-legacy-audit
  name: Migrate LegacyAuditClient to AuditClient
  owner_team: platform-audit
  description: Replace deprecated audit client call sites with AuditEvent-based API.
  examples:
    - legacyAuditClient.record(userId, "APPROVE_CASE", caseId)
  non_examples:
    - redesign audit taxonomy
    - change audit event semantics
  default_lane: supervised_pr
  allowed_repositories:
    - java-service
  allowed_paths:
    - src/main/java/**
    - src/test/java/**
  forbidden_paths:
    - openapi/**
    - src/main/resources/db/migration/**
    - generated/**
  required_inputs:
    - sourceApplicationName
  expected_diff_shape:
    - replace legacy audit client injection
    - construct AuditEvent object
    - update tests for constructor/API change
  forbidden_diff_shape:
    - remove audit calls
    - replace audit with logging only
    - disable tests
  verifier_profile: java-maven-standard
  risk_profile: medium-production-audit-path
  stop_conditions:
    - more than 20 files changed
    - old API remains after 3 repair attempts
  pr_template: api-migration-pr-template-v1
  success_metrics:
    - no old API usage remains
    - mvn test pass
    - reviewer requests fewer than 2 changes
  rollout_policy:
    batch_size: 5
    require_owner_review: true

22. Comprehension checklist

Before moving on, make sure you can answer:

Why should the first use case build trust rather than just look sophisticated?
What are the differences between autonomous_pr, supervised_pr, draft_only, analysis_only, and blocked?
What are the eight use case selection dimensions?
Why is a dependency upgrade a good fit but still risky?
Why can API migration be a good first use case?
When is config migration safe and when is it dangerous?
Why does test generation need its own policy?
Which hard blockers must drop the lane to blocked?
Why is a stop condition a safety feature?
Why is a deterministic AST transform sometimes better than an LLM agent?

23. Small exercise

Pick three candidate use cases from your own work. Fill in the following table:

Use case	Clarity	Boundedness	Verifiability	Repeatability	Value	Blast radius safety	Reversibility	Reviewability	Lane

Then write one use_case_card for the best candidate.

Do not start from the prompt. Start from classification.

24. Summary

Use case selection is the first guardrail of an AI coding agent.

We choose work based on:

objective clarity;
bounded scope;
verifier strength;
repeatability;
value;
blast radius;
reversibility;
reviewability.

We do not give every task the same mode. We use lanes:

autonomous_pr;
supervised_pr;
draft_only;
analysis_only;
blocked.

For this series, our main use case is Java internal API migration because it is realistic enough and challenging enough, yet still verifiable with compile/test/grep/policy/judge.

In the next part we will write the requirements, non-functional requirements, invariants, and acceptance criteria for this platform. That will turn the domain model into a specification that can be implemented.
