Kaufman Skill Map
Kaufman skill map for breaking Python AI Application Engineering down into subskills that can be practiced, measured, and used to build production-grade AI applications.
This track is ordered for sequential learning. Start from the first part if you want the full mental model, or jump directly into a chapter if you already know the foundations.
Ordered progression from foundations to advanced topics
Kaufman skill map for breaking Python AI Application Engineering down into subskills that can be practiced, measured, and used to build production-grade AI applications.
Mental model of the Python AI Application Engineer role, its boundaries with ML, Data, and Platform roles, and a production-grade way of thinking about probabilistic AI systems.
Designing end-to-end, production-grade LLM application architecture: boundaries, request lifecycle, context, tools, retrieval, eval, observability, reliability, and governance.
Python AI application project structure that is maintainable, testable, observable, eval-first, and production-ready without falling into the trap of framework-first design.
Model interface and provider abstraction for building Python AI applications that are not locked into a single vendor or model and remain typed, observable, testable, and production-ready.
Prompting as protocol design: how to design LLM instructions that are modular, versioned, testable, auditable, safe, and usable in production AI applications.
Structured output, schema design, validation, repair loops, and typed contracts for production-grade Python AI applications.
Tool calling, function contracts, authorization, idempotency, approval gates, and auditability for production-grade Python AI applications.
Conversation state, context management, memory boundaries, summarization, context packing, and auditability for production-grade Python AI applications.
Async Python, streaming responses, cancellation, timeout, backpressure, queues, and runtime reliability for production-grade AI applications.
Embeddings, semantic representation, similarity, vector records, embedding pipelines, quality diagnostics, and production retrieval foundations for Python AI applications.
Production document ingestion and parsing pipelines for AI applications, including source connectors, canonical elements, provenance, metadata, idempotency, quality gates, and regulatory auditability.
Chunking, indexing, and knowledge modeling for production-grade RAG systems.
Vector search, hybrid retrieval, reranking, filtering, and ranking pipelines for production-grade RAG.
End-to-end RAG pipeline design for production AI applications, including query planning, retrieval orchestration, context assembly, answer contracts, citations, refusal, and observability.
Systematic diagnosis of RAG failure modes across ingestion, chunking, indexing, retrieval, reranking, context assembly, generation, citations, and production operations.
Enterprise RAG knowledge systems: tenancy, permissions, metadata, source authority, freshness, lineage, governance, auditability, and knowledge operations.
Agent mental model for production AI applications: perception, planning, tool use, state, memory, policies, autonomy boundaries, and failure control.
Agent workflow orchestration with state machines, graph execution, deterministic nodes, model decision nodes, human approval, checkpointing, retries, interrupts, and production tracing.
Tool registry, Model Context Protocol, and integration contracts for safe, typed, observable, and permission-aware AI tool use.
Agent memory and long-running task engineering: working state, durable memory, checkpoints, resumability, interrupts, approvals, retention, privacy, and recovery.
Multi-agent systems and boundaries: when to use multiple agents, coordination patterns, supervisor routing, handoffs, shared state, failure isolation, evaluation, and anti-patterns.
Evaluation foundations for production AI applications: eval-first mindset, datasets, rubrics, metrics, regression gates, scenario design, calibration, and release readiness.
Practical evaluation for RAG and agent systems: retrieval metrics, groundedness, citation accuracy, tool correctness, trajectory scoring, safety gates, and diagnostic eval reports.
LLM-as-judge and human review for AI application evaluation: rubric design, calibration, bias control, disagreement handling, adjudication, quality sampling, and review operations.
Testing AI applications across deterministic code, prompts, structured outputs, providers, RAG, tools, agents, workflows, safety, regression, and CI release gates.
Observability, tracing, and debugging for production AI applications: GenAI telemetry, prompt/model traces, retrieval traces, tool traces, agent traces, metrics, logs, replay, privacy, and incident diagnosis.
Reliability patterns for AI systems: timeout budgets, retries with jitter, fallback, circuit breakers, bulkheads, rate limits, backpressure, idempotency, graceful degradation, chaos testing, and operational runbooks.
Latency, cost, and throughput engineering for production AI applications: token economics, TTFT, streaming, batching, caching, model routing, retrieval budgets, concurrency, queues, and capacity planning.
Security threat modeling for LLM applications: prompt injection, data exfiltration, insecure tool use, excessive agency, insecure output handling, RAG poisoning, supply-chain risk, and defense-in-depth.
Privacy, governance, and auditability for production AI applications: data classification, consent, retention, lineage, DPIA-style review, model/provider governance, audit trails, policy controls, and regulated workflow defensibility.
Deployment architecture and runtime operations for Python AI applications: service topology, model gateway, RAG services, workers, queues, Kubernetes, secrets, rollout/rollback, scaling, health checks, SLOs, and operational runbooks.
AI CI/CD and readiness gates for production AI systems: prompt/model/index/tool/workflow versioning, eval gates, security gates, cost gates, release trains, canary, shadow, rollback, and production readiness review.
Enterprise case-management AI capstone integrating RAG, agents, tools, evaluation, security, governance, deployment, observability, reliability, and operations into one production architecture.
Top one percent operational playbook for Python AI application engineers: principles, review checklists, incident habits, architecture judgment, decision records, career leverage, and mastery loops.