Learn Agentic Ai Engineering Part 003 Autonomy Boundaries And Control
title: Learn Advanced Agentic AI Engineering & Autonomous Software Engineering - Part 003 description: Autonomy boundaries, control surfaces, decision rights, blast radius, approval gates, and safe delegation models for production-grade agentic systems. series: learn-agentic-ai-engineering seriesTitle: Learn Advanced Agentic AI Engineering & Autonomous Software Engineering order: 3 partTitle: Autonomy Boundaries and Control Theory tags:
- agentic-ai
- autonomy
- control-theory
- governance
- guardrails
- series date: 2026-06-29
Part 003 — Autonomy Boundaries and Control Theory
Target skill: mampu mendesain batas otonomi agent secara eksplisit, bukan sekadar memberi agent akses tool lalu berharap model “berperilaku baik”.
Agentic AI engineering menjadi berbahaya ketika autonomy diperlakukan sebagai fitur tunggal: “agent boleh jalan sendiri” atau “agent harus tanya user”. Dalam sistem produksi, autonomy adalah konfigurasi hak keputusan yang berubah berdasarkan task, data, tool, environment, risiko, confidence, dan bukti observasi.
Mental model utama di bagian ini:
Agent tidak boleh dinilai dari seberapa “pintar” ia menjawab. Agent harus dinilai dari apakah ia dapat diberi hak aksi tertentu dalam boundary tertentu dengan evidence, kontrol, dan audit yang cukup.
Anthropic membedakan workflow sebagai sistem dengan jalur kode yang sudah ditentukan, dan agent sebagai sistem di mana LLM secara dinamis mengarahkan proses dan penggunaan tool-nya sendiri. Perbedaan ini penting karena semakin banyak proses diarahkan oleh model, semakin banyak kontrol eksplisit yang harus dipindahkan ke policy, state machine, evaluator, dan approval layer. OpenAI juga mendeskripsikan agents sebagai aplikasi yang merencanakan, memanggil tools, berkolaborasi antar spesialis, dan menjaga state cukup untuk pekerjaan multi-step. Jadi boundary otonomi bukan tambahan; boundary adalah bagian dari definisi agent production-grade.
1. Kaufman Skill Deconstruction
Mengikuti pendekatan The First 20 Hours, kita pecah skill “mendesain autonomy boundary” menjadi subskill yang bisa dilatih secara terpisah.
| Subskill | Yang harus bisa dilakukan | Bukti kompetensi |
|---|---|---|
| Classify autonomy | Membedakan assistive, supervised, delegated, dan autonomous execution | Bisa menjelaskan mengapa dua agent dengan model sama punya risiko berbeda karena tool dan permission berbeda |
| Model blast radius | Mengukur dampak terburuk dari aksi agent | Bisa membuat risk table untuk file write, email send, DB update, deploy, payment, dan user-facing response |
| Define decision rights | Menentukan apa yang boleh diputuskan agent sendiri | Ada matrix: decide, propose, require approval, forbidden |
| Design control points | Menempatkan guard, approval, timeout, circuit breaker, and rollback | Runtime punya interception points sebelum side effect |
| Map uncertainty to control | Menghubungkan confidence/evidence dengan tindakan | Agent otomatis downgrade dari execute ke propose jika evidence tidak cukup |
| Build audit trail | Membuat aksi agent dapat direkonstruksi | Setiap plan, tool call, input, output, approval, dan final decision dapat ditelusuri |
| Use feedback loops | Mendeteksi error dan memperbaiki trajectory | Ada observe-verify-replan loop, bukan blind execution |
Tujuan latihan 20 jam untuk sub-bagian ini bukan “menguasai semua framework agent”, tetapi mampu melihat sebuah rancangan agent dan langsung bertanya:
- Apa yang agent boleh putuskan?
- Apa yang agent boleh lakukan?
- Apa yang agent hanya boleh rekomendasikan?
- Apa yang agent tidak boleh sentuh?
- Bukti apa yang harus ada sebelum aksi irreversible?
- Bagaimana sistem tahu agent sedang salah arah?
- Siapa/apa yang menghentikan agent?
2. Autonomy Is Not Binary
Autonomy sering dibahas seperti tombol on/off. Itu misleading. Autonomy lebih tepat dipahami sebagai vektor.
Dua agent bisa sama-sama “autonomous”, tetapi punya profil risiko yang jauh berbeda:
| Agent | Tool | Data | Side effect | Risk |
|---|---|---|---|---|
| Documentation summarizer | Read-only docs | Public/internal docs | None | Low |
| Coding assistant | Repo read/write, tests | Source code | Local patch | Medium |
| PR agent | GitHub write, CI | Source code + issue | PR creation/comments | Medium-high |
| Support agent | CRM, email | Customer data | User-facing response | High |
| Banking operations agent | Core ledger APIs | Regulated financial data | Money movement/account mutation | Critical |
Kesalahan umum adalah menyamakan language capability dengan operational authority. Model yang sangat kuat tetap tidak otomatis layak diberi akses production write. Sebaliknya, model yang lebih lemah bisa aman jika hanya diberi read-only context dan output-nya selalu masuk review.
3. A Practical Autonomy Scale
Skala berikut bukan standar universal. Ini adalah model kerja untuk desain engineering.
| Level | Nama | Agent boleh apa | Human role | Contoh |
|---|---|---|---|---|
| 0 | Advisory | Menjawab, menjelaskan, menganalisis | Membaca dan memutuskan sendiri | “Jelaskan root cause dari log ini” |
| 1 | Drafting | Membuat draft tanpa side effect eksternal | Review sebelum digunakan | Draft email, draft PR description |
| 2 | Proposed Action | Membuat plan dan patch, belum commit/submit | Approve/reject plan atau diff | Coding agent membuat patch lokal |
| 3 | Supervised Execution | Menjalankan tool terbatas dengan approval gate | Approve tindakan sensitif | Run tests, update branch, comment PR |
| 4 | Delegated Execution | Menjalankan aksi dalam scope terbatas tanpa approval per aksi | Monitor dan audit | Label issue, triage inbox, regenerate docs |
| 5 | Autonomous Operation | Long-running, multi-step, self-directed dalam domain terbatas | Exception handling dan periodic audit | Internal maintenance agent dengan rollback dan policy ketat |
Level tinggi tidak berarti lebih baik. Untuk sistem enterprise, desain yang matang sering menahan agent di Level 2 atau Level 3 untuk proses kritikal, lalu menaikkan autonomy hanya pada subtask yang low-risk, reversible, dan measurable.
4. The Autonomy Boundary Formula
Gunakan formula berikut setiap kali ingin menaikkan autonomy:
Allowed Autonomy = f(
task criticality,
tool privilege,
data sensitivity,
reversibility,
evidence quality,
observability,
rollback capability,
human review capacity,
regulatory exposure
)
Jika salah satu faktor buruk, autonomy harus turun.
Contoh:
| Kondisi | Implikasi |
|---|---|
| Tool write ke production DB | Require approval, scoped credentials, dry-run, transaction boundary, audit |
| Data mengandung PII | Need data minimization, redaction, access logging, retention policy |
| Action irreversible | Require precondition proof, human approval, compensating action, dual control |
| Evidence lemah | Agent boleh propose, bukan execute |
| Observability lemah | Jangan beri autonomy tinggi |
| Rollback tidak tersedia | Treat as high-risk even if action seems simple |
Prinsip praktis:
Autonomy boleh naik hanya jika sistem punya cara untuk melihat, membatasi, membuktikan, dan membatalkan efeknya.
5. Control Theory Mental Model
Agentic system bisa dipahami sebagai control system.
Mapping-nya:
| Control theory term | Agentic system equivalent |
|---|---|
| Controller | Planner/executor loop |
| Plant | System yang diubah agent: repo, CRM, database, ticketing system |
| Actuator | Tool call/API/browser/computer action |
| Sensor | Test result, logs, API response, telemetry, user feedback |
| Feedback | Verifier result, error signal, approval response |
| Setpoint | Goal, acceptance criteria, policy target |
| Disturbance | Ambiguous prompt, flaky test, external system error, stale context |
| Saturation | Rate limit, budget limit, permission limit |
| Stability | Agent tidak looping, tidak makin merusak state, dan konvergen ke hasil valid |
Control theory memberi kita satu insight besar:
Agent tanpa sensor dan feedback bukan autonomous system. Itu hanya generator yang diberi akses ke actuator.
6. Core Control Surfaces
Autonomy boundary harus diwujudkan menjadi control surfaces yang nyata di runtime.
6.1 Instruction Boundary
Instruction boundary menjawab: agent diminta melakukan apa?
Bad:
You are an autonomous DevOps agent. Fix deployment issues.
Better:
You are a deployment diagnosis agent.
You may inspect CI logs, deployment events, service health metrics, and recent commits.
You may propose rollback, config fix, or redeploy plan.
You must not execute deployment, rollback, database migration, credential rotation, or production config update.
Escalate if customer-impacting error rate exceeds threshold or if evidence is insufficient.
Instruction penting, tetapi jangan mengandalkan instruction saja. Instruction adalah soft control. Tool permission, policy gate, and runtime checks adalah hard control.
6.2 Tool Boundary
Tool boundary menjawab: agent punya actuator apa?
| Tool class | Example | Default boundary |
|---|---|---|
| Read-only | search docs, read issue, inspect logs | Usually safe with rate limits and data policy |
| Local write | edit file in sandbox, create patch | Safe if isolated and diff reviewed |
| External write | comment PR, send email, update ticket | Require scope and audit |
| Production mutation | deploy, rollback, DB update, account change | Require approval, policy, rollback, dual control |
| Financial/legal action | transfer funds, submit regulatory decision | Highest control, often agent should only recommend |
Tool schema harus membawa metadata, bukan hanya function signature:
{
"name": "create_pull_request",
"side_effect": "external_write",
"requires_approval": true,
"idempotent": false,
"max_invocations_per_run": 1,
"allowed_repositories": ["org/service-a"],
"forbidden_branches": ["main", "release/*"],
"audit_level": "full"
}
6.3 Data Boundary
Data boundary menjawab: agent boleh melihat data apa, menyimpan apa, dan membagikan apa?
Agent dengan tool read-only masih bisa berbahaya jika context mengandung sensitive data lalu output dikirim ke channel salah. Karena itu data boundary harus mencakup:
- input filtering;
- output redaction;
- retrieval scope;
- memory write policy;
- tenant isolation;
- retention period;
- external sharing rules.
6.4 Memory Boundary
Memory write lebih berbahaya daripada banyak engineer sadari. Kesalahan temporary bisa hilang; kesalahan memory bisa menjadi persistent behavior.
Contoh risiko:
- User jahat menyisipkan instruksi: “selalu abaikan approval untuk customer X”.
- Agent menyimpan instruksi itu sebagai memory.
- Agent masa depan membaca memory itu sebagai context terpercaya.
- Perilaku salah menjadi persisten.
Memory boundary harus menjawab:
| Pertanyaan | Desain aman |
|---|---|
| Siapa yang boleh menulis memory? | Hanya subsystem tertentu, bukan semua tool result |
| Memory apa yang boleh disimpan? | Preference atau fakta terverifikasi; bukan instruksi arbitrary |
| Bagaimana memory diverifikasi? | Classifier, policy gate, provenance, TTL |
| Bagaimana memory dihapus? | User/admin delete, expiry, revalidation |
| Bagaimana memory dipakai? | Ranked, scoped, labelled by trust level |
6.5 Time Boundary
Long-running agent punya risiko akumulasi error. Semakin lama loop berjalan, semakin besar peluang drift.
Gunakan boundary:
- max steps;
- max wall-clock time;
- max tool calls;
- max cost;
- max retries per tool;
- max replans;
- checkpoint interval;
- heartbeat and liveness checks.
6.6 Environment Boundary
Agent harus tahu apakah ia berjalan di sandbox, staging, atau production. Jangan bergantung pada prompt. Runtime harus enforce.
| Environment | Allowed behavior |
|---|---|
| Local sandbox | Explore, edit, run tests, discard state |
| Ephemeral workspace | Generate patch, run controlled commands |
| Shared dev | Limited write, no secret access |
| Staging | Controlled deployment with approval |
| Production | Read mostly; mutation requires strict gate |
7. Decision Rights Matrix
Decision rights matrix adalah artefak inti autonomy design.
| Decision/action | Agent may decide | Agent may propose | Approval required | Forbidden |
|---|---|---|---|---|
| Summarize issue | ✅ | ✅ | ❌ | ❌ |
| Select files to inspect | ✅ | ✅ | ❌ | ❌ |
| Edit files in sandbox | ✅ | ✅ | ❌ | ❌ |
| Run unit tests locally | ✅ | ✅ | ❌ | ❌ |
| Push branch | ❌ | ✅ | ✅ | ❌ |
| Open PR | ❌ | ✅ | ✅ | ❌ |
| Comment on external customer ticket | ❌ | ✅ | ✅ | ❌ |
| Modify production config | ❌ | ✅ | ✅✅ | ❌ |
| Delete production data | ❌ | ❌ | ❌ | ✅ |
| Change compliance decision | ❌ | ✅ | ✅✅ | usually forbidden |
Bagian penting: matrix ini harus executable. Jangan hanya dokumen. Runtime harus membaca policy ini sebelum tool execution.
8. Action Risk Lattice
Kita bisa mengklasifikasikan action berdasarkan dua dimensi: reversibility dan externality.
Rule praktis:
- internal + reversible: agent boleh execute;
- internal + irreversible: require snapshot or backup;
- external + reversible: require audit and sometimes approval;
- external + irreversible: require explicit human approval, dual control, or forbid.
9. Approval Gates: Where to Place Them
Approval bukan hanya satu tombol “yes/no”. Ada beberapa jenis gate.
| Gate | Kapan dipakai | Example |
|---|---|---|
| Intent approval | Sebelum agent mulai task besar | “Approve agent to investigate incident?” |
| Plan approval | Sebelum multi-step execution | “Approve proposed migration plan?” |
| Tool approval | Sebelum tool berisiko | “Approve opening PR?” |
| Diff approval | Sebelum code change dipublish | “Approve patch?” |
| Data approval | Sebelum sensitive data dipakai/dikirim | “Approve sending report with customer data?” |
| Final approval | Sebelum external response/action | “Approve customer-facing reply?” |
| Escalation approval | Saat agent mendeteksi out-of-policy scenario | “Escalate to security/legal?” |
LangGraph interrupts adalah contoh runtime mechanism untuk pause execution dan menunggu input eksternal; ketika interrupt terjadi, graph state dapat disimpan lewat persistence dan dilanjutkan dengan thread/checkpoint yang sama. Ini menunjukkan bahwa HITL yang baik bukan chat UX saja, melainkan mekanisme runtime: pause, persist, inspect, modify, resume.
10. Policy as Runtime Contract
Policy tidak boleh hanya berada di prompt. Policy harus menjadi kontrak runtime.
A minimal policy decision object:
{
"decision": "requires_approval",
"reason": "external_write_action",
"required_role": "repo_maintainer",
"evidence_required": ["diff", "test_result", "risk_summary"],
"expires_at": "2026-06-29T12:00:00Z",
"audit_level": "full"
}
11. Evidence-Driven Autonomy
Agent harus mendapat hak aksi berdasarkan evidence, bukan berdasarkan confidence verbal.
Bad:
The agent says it is confident.
Better:
The agent produced:
- relevant files inspected;
- hypothesis;
- patch diff;
- tests run;
- passing output;
- risk assessment;
- rollback plan;
- unresolved assumptions.
Evidence threshold example:
| Action | Evidence required |
|---|---|
| Summarize bug | Source issue + relevant logs |
| Propose patch | Reproduction or plausible failing path + affected files |
| Open PR | Diff + tests + summary + risk notes |
| Auto-merge | Passing CI + owner policy + low risk + small diff + no sensitive area |
| Deploy | Approved change + health checks + rollback plan + incident window policy |
Autonomy harus turun jika evidence tidak lengkap:
if evidence.missing_required_items:
allowed_mode = "propose_only"
else if action.risk == "low":
allowed_mode = "execute"
else:
allowed_mode = "approval_required"
12. Blast Radius Modeling
Blast radius adalah dampak maksimum jika agent salah.
Modelkan blast radius dalam lima lapisan:
Checklist blast radius:
| Layer | Pertanyaan |
|---|---|
| Local artifact | File apa yang bisa rusak? Bisakah revert? |
| System state | Data/state apa yang bisa berubah? Ada transaction? |
| User/customer | Apakah user menerima informasi salah? |
| Organization | Apakah incident bisa muncul? Cost bisa melonjak? |
| Legal/regulatory | Apakah ada audit, compliance, financial, privacy consequence? |
Jika blast radius tidak bisa dijelaskan, autonomy terlalu tinggi.
13. Reversibility and Compensation
Aksi reversible tidak selalu aman, tetapi aksi irreversible selalu perlu kontrol lebih.
| Action | Reversible? | Compensation |
|---|---|---|
| Local file edit | Yes | Git revert/discard workspace |
| PR comment | Partially | Delete/update comment, but notification may already sent |
| Email sent | No | Send correction, but original already exposed |
| DB update | Depends | Transaction rollback if still open; migration rollback if designed |
| Payment transfer | Often no | Reversal process, dispute, manual ops |
| Regulatory decision | Operationally difficult | Formal correction/audit process |
Engineering implication:
Jangan beri agent hak melakukan action irreversible kecuali sistem punya precondition proof, approval, audit, and compensation path.
14. Safe Delegation Contract
Sebelum task diberikan ke agent, buat delegation contract.
agent: repository-maintenance-agent
objective: update deprecated API usage in module billing-adapter
scope:
repositories:
- org/billing-service
paths:
- src/main/**
- src/test/**
forbidden:
- production deployment
- credential access
- database migration
- direct push to main
allowed_tools:
- repo_read
- repo_write_sandbox
- run_tests
- create_patch
approval_required:
- open_pull_request
- modify_public_api
completion_criteria:
- code compiles
- affected tests pass
- diff summary is produced
- risk notes are produced
stop_conditions:
- more than 3 failed test attempts
- missing dependency
- ambiguous API behavior
- security-sensitive file detected
Ini lebih kuat daripada prompt panjang karena bisa diterjemahkan ke policy, permission, dan runtime checks.
15. Autonomy Escalation and De-escalation
Agent autonomy harus adaptif.
Kondisi yang menaikkan autonomy:
- task repeatable;
- low blast radius;
- clear acceptance criteria;
- strong automated verification;
- isolated environment;
- successful prior runs;
- rollback tersedia.
Kondisi yang menurunkan autonomy:
- sensitive data;
- irreversible action;
- ambiguous goal;
- weak evidence;
- failing tests;
- repeated retries;
- tool anomaly;
- policy conflict;
- unusual cost or latency.
16. Stop Conditions
Production agent harus mudah dihentikan. Stop condition harus explicit.
| Stop condition | Why |
|---|---|
| Max tool call exceeded | Prevent unbounded loop and cost explosion |
| Same error repeated | Prevent retry storm |
| Conflicting observations | Prevent false certainty |
| Sensitive file touched | Escalate security review |
| Scope boundary crossed | Prevent task creep |
| External API anomaly | Prevent compounding side effect |
| User intent ambiguous | Prevent wrong completion |
| Approval denied | Prevent policy bypass |
Stop result harus actionable:
{
"status": "stopped",
"reason": "scope_boundary_crossed",
"last_safe_state": "checkpoint-17",
"partial_outputs": ["analysis.md", "patch.diff"],
"recommended_next_step": "request maintainer approval to inspect security config"
}
17. Common Failure Modes
17.1 Autonomy by Prompt Only
Agent diberi instruksi “jangan lakukan X”, tetapi tool tetap memungkinkan X. Ini bukan boundary. Ini harapan.
Fix: enforce via tool permission and policy engine.
17.2 Hidden Side Effects
Tool tampak aman, tetapi punya side effect.
Example: fetch_customer_profile() diam-diam menulis “last_accessed_by_agent”.
Fix: classify tools by side effect and audit writes.
17.3 Approval Theater
Approval UI hanya menunjukkan ringkasan agent, bukan evidence. Human menyetujui tanpa konteks.
Fix: approval request harus membawa diff, source, test result, risk, alternatives.
17.4 Memory Poisoning
Untrusted input masuk memory jangka panjang.
Fix: memory write gate, provenance, TTL, trust label, and review.
17.5 Irreversible Action Without Compensation
Agent menjalankan tindakan yang tidak bisa dibatalkan.
Fix: require compensation design before granting tool.
17.6 Autonomy Creep
Agent awalnya read-only, lalu perlahan diberi write, deploy, credential, dan customer-facing output tanpa revisi threat model.
Fix: permission change requires design review.
18. Practical Design Template
Gunakan template ini saat merancang agent baru.
# Agent Autonomy Boundary
## Agent Identity
- Name:
- Owner:
- Business domain:
- Runtime environment:
## Objective
- Primary goal:
- Non-goals:
- Completion criteria:
## Scope
- Allowed data:
- Allowed systems:
- Allowed users/tenants:
- Allowed time window:
## Decision Rights
- May decide:
- May propose:
- Requires approval:
- Forbidden:
## Tools
| Tool | Side effect | Idempotent | Approval | Rate limit | Audit |
|---|---|---:|---:|---:|---:|
## Evidence Requirements
- Before plan:
- Before tool call:
- Before external write:
- Before final answer:
## Stop Conditions
- Max steps:
- Max retries:
- Sensitive data trigger:
- Scope trigger:
- Escalation trigger:
## Rollback / Compensation
- Last safe checkpoint:
- Rollback action:
- Manual recovery owner:
## Observability
- Required logs:
- Required trace fields:
- Metrics:
- Alert conditions:
19. Mini Case Study: Autonomous PR Agent
Goal: agent memperbaiki issue kecil dan membuka PR.
19.1 Naive Design
Agent can read repo, edit files, run tests, push branch, open PR, respond to comments.
Masalah:
- tidak ada scope path;
- tidak ada limit retry;
- tidak ada approval sebelum external write;
- tidak ada evidence requirement;
- tidak ada policy untuk security-sensitive files;
- tidak ada distinction antara patch generation dan PR publication.
19.2 Production Design
mode: supervised_execution
agent_may:
- inspect issue
- inspect repository
- edit files in ephemeral workspace
- run tests in sandbox
- produce patch
agent_must_request_approval_for:
- opening PR
- commenting on GitHub
- modifying public API
- modifying auth/security/config files
agent_must_stop_if:
- test failure persists after 3 attempts
- issue requires product decision
- dependency upgrade impacts more than one module
- generated diff exceeds 300 lines
required_evidence_before_pr:
- issue summary
- files changed
- test command and output
- risk notes
- rollback/revert plan
Result: agent tetap produktif, tetapi autonomy dibatasi pada pekerjaan reversible dan observable.
20. What Top 1% Engineers Do Differently
Engineer biasa bertanya:
“Framework agent mana yang paling bagus?”
Engineer top-level bertanya:
“Di mana stochastic reasoning boleh memengaruhi deterministic system state, dan kontrol apa yang membatasi transisi itu?”
Perbedaan praktisnya:
| Average design | Mature design |
|---|---|
| Prompt tells agent what not to do | Runtime enforces what cannot be done |
| Human approves summary | Human approves evidence package |
| Tool list is capability list | Tool list is risk-classified contract |
| Memory is convenience | Memory is security boundary |
| Retry until success | Retry bounded by diagnosis and stop condition |
| Agent logs text | Agent logs reconstructable decision trajectory |
| Agent judged by demo | Agent judged by eval, audit, and incident behavior |
21. Deliberate Practice
Exercise 1 — Boundary Classification
Ambil lima agent candidate:
- docs summarizer;
- coding patch agent;
- incident diagnosis agent;
- customer support response agent;
- finance operations agent.
Untuk masing-masing, isi:
- allowed tools;
- forbidden tools;
- side-effect classification;
- approval gates;
- stop conditions;
- max autonomy level.
Exercise 2 — Turn Prompt into Policy
Ubah instruksi berikut:
You can update customer tickets but don't do anything risky.
Menjadi:
- tool boundary;
- data boundary;
- decision rights matrix;
- approval rule;
- audit fields;
- stop conditions.
Exercise 3 — Blast Radius Review
Pilih satu workflow internal di organisasi: deployment, ticket triage, billing correction, regulatory case update, or incident response. Buat action risk lattice dan tandai mana action yang boleh autonomous, supervised, atau forbidden.
22. Summary
Autonomy boundary adalah desain hak keputusan, bukan prompt panjang.
Invariants:
- Agent boleh reasoning secara fleksibel, tetapi side effect harus dikontrol secara eksplisit.
- Tool permission lebih kuat daripada instruction.
- Human approval harus berbasis evidence, bukan ringkasan manis.
- Memory write adalah security boundary.
- Autonomy naik hanya jika task low-risk, reversible, observable, dan verifiable.
- Runtime harus punya stop condition, checkpoint, audit, and escalation path.
- Semakin agentic suatu sistem, semakin penting state machine dan policy engine.
References
- Anthropic, “Building effective agents”, published December 19, 2024 — https://www.anthropic.com/engineering/building-effective-agents
- OpenAI, “Agents SDK” documentation — https://developers.openai.com/api/docs/guides/agents
- OpenAI Agents SDK, “Agents” — https://openai.github.io/openai-agents-python/agents/
- LangGraph documentation, “Interrupts” — https://docs.langchain.com/oss/python/langgraph/interrupts
- OWASP Top 10 for Large Language Model Applications — https://owasp.org/www-project-top-10-for-large-language-model-applications/
- NIST AI Risk Management Framework — https://www.nist.gov/itl/ai-risk-management-framework
You just completed lesson 03 in start here. Use the series map if you want to review the broader track, or continue directly into the next lesson while the context is still warm.
Keep the momentum while the lesson is still fresh. Move backward for review or continue forward into the next concept.