
title: Learn AI Code Documentation & Agent Memory Platform - Part 019
description: Doc quality gates for evaluating accuracy, completeness, freshness, traceability, style, duplication, review readiness, doc debt, and automated evaluation in an AI documentation platform.
series: learn-ai-code-documentation-agent-memory
seriesTitle: Learn AI Code Documentation & Agent Memory Platform
order: 19
partTitle: Doc Quality Gates
tags:
  - ai
  - documentation
  - quality-gates
  - doc-evaluation
  - code-intelligence
  - provenance
  - software-architecture
date: 2026-07-02

Part 019 — Doc Quality Gates

1. Goals of This Part

Part 018 covered the code-to-doc generation pipeline. This part covers what determines whether the documentation generator's output can be trusted: doc quality gates.

Without quality gates, an AI documentation platform produces a large volume of text of unstable quality. Some of it is accurate, some is too generic, some is stale, some is hallucinated, some merely restates the source code, and some is dangerous because it hides uncertainty.

Quality gates are the mechanism for answering:

Is the document accurate?
Does every claim have evidence?
Is the scope correct?
Is the document complete enough for its doc type?
Is the source still fresh?
Does the document contain contradictions?
Is the generated doc ready for review?
Is it safe to publish?
Is there doc debt?
Do these docs help both humans and agents?

Targets for this part:

design the quality dimensions for documentation,
build gates for accuracy, completeness, freshness, traceability, style, duplication, security, and review readiness,
build a claim-level verification model,
build doc debt scoring,
build automated evaluation and a human review loop,
produce actionable quality reports,
connect quality gates to the generation pipeline, stale detection, and memory.

2. Why Quality Gates Are Mandatory

AI can speed up documentation, but it can just as easily speed up the production of bad documentation.

2.1 Bad Documentation Is More Dangerous Than No Documentation

Missing documentation tells engineers that they have to read the source.

Wrong documentation leads engineers and agents to trust false information.

A bad example:

The service guarantees exactly-once event processing.

If the source only shows a plain Kafka consumer with no evidence of idempotency, this claim is dangerous.

2.2 Quality Gates as a Trust Boundary

Generated docs must pass through a trust boundary before they can be trusted. A quality gate does not replace human review; it is an initial filter that makes review more efficient.

3. Quality Dimensions

3.1 Universal Dimensions

Dimension	Question
accuracy	Do the claims match the evidence?
completeness	Are the required sections present?
traceability	Can each claim be traced to a source?
freshness	Is the source evidence still current?
scope correctness	Does the document stay within its target?
clarity	Is it clear for the intended audience?
structure	Does it follow the template?
consistency	Are terms and claims consistent?
duplication	Does it repeat other docs unnecessarily?
safety	Does it avoid leaking secrets or private information?
review readiness	Is it ready for review?
agent usefulness	Is it usable as agent context where needed?

3.2 Quality Is Not a Single Number

A single total score helps with prioritization, but review needs per-dimension detail.

quality:
  accuracy: pass
  completeness: warn
  traceability: pass
  freshness: pass
  safety: pass
  reviewReadiness: warn
  overall: pass_with_warnings

4. Quality Gate Architecture

4.1 Gate Inputs

generated document,
doc request,
doc type template,
context pack,
evidence map,
claim list,
graph snapshot,
source snapshot,
freshness report,
permission context,
style guide.

4.2 Gate Outputs

pass/fail/warn,
quality report,
unsupported claims,
contradicted claims,
missing sections,
stale evidence,
security findings,
repair suggestions,
reviewer checklist.

5. Accuracy Gate

The accuracy gate ensures that claims match the evidence.

5.1 Claim-Level Accuracy

Claim:

`OrderService.createOrder` validates the request before saving the order. [E1][G1]

Check:

E1 exists,
G1 exists,
graph path supports validate-before-save ordering,
source snapshot matches,
no stronger contradictory evidence.

5.2 Accuracy Status

Status	Meaning
`supported`	evidence supports claim
`unsupported`	no evidence
`contradicted`	evidence refutes claim
`uncertain`	evidence weak/ambiguous
`not_evaluable`	claim too vague
`out_of_scope`	claim outside target scope

5.3 Accuracy Report

accuracy:
  status: fail
  supportedClaims: 18
  unsupportedClaims:
    - claim: "Validation rules are loaded from database."
      reason: "No cited evidence supports database-backed rules."
  contradictedClaims:
    - claim: "Endpoint is POST /order."
      contradicts: "OpenAPI and route graph show POST /orders."

5.4 Accuracy Gate Rule

gate:
  name: accuracy
  passIf:
    unsupportedClaims: 0
    contradictedClaims: 0
  warnIf:
    uncertainClaimsBelow: 3

For internal drafts, uncertain claims may pass if clearly marked.
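The rule above can be sketched in Java as follows. This is a minimal illustration, not the platform's implementation: `AccuracyGateSketch` and its nested enums are hypothetical names, and the handling of `uncertainClaimsBelow: 3` (one or two uncertain claims warn, three or more fail) is an assumed reading of the threshold.

```java
import java.util.List;

final class AccuracyGateSketch {
    enum ClaimStatus { SUPPORTED, UNSUPPORTED, CONTRADICTED, UNCERTAIN, NOT_EVALUABLE, OUT_OF_SCOPE }
    enum GateStatus { PASS, WARN, FAIL }

    // Assumed reading of "uncertainClaimsBelow: 3": one or two uncertain
    // claims produce a warning, three or more fail the gate.
    static final int UNCERTAIN_LIMIT = 3;

    static GateStatus evaluate(List<ClaimStatus> claims) {
        long unsupported = count(claims, ClaimStatus.UNSUPPORTED);
        long contradicted = count(claims, ClaimStatus.CONTRADICTED);
        long uncertain = count(claims, ClaimStatus.UNCERTAIN);

        // Any unsupported or contradicted claim fails the gate outright.
        if (unsupported > 0 || contradicted > 0 || uncertain >= UNCERTAIN_LIMIT) {
            return GateStatus.FAIL;
        }
        return uncertain > 0 ? GateStatus.WARN : GateStatus.PASS;
    }

    private static long count(List<ClaimStatus> claims, ClaimStatus wanted) {
        return claims.stream().filter(c -> c == wanted).count();
    }
}
```

A draft-only policy could relax the uncertain-claim limit while keeping the two hard failure conditions unchanged.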

6. Traceability Gate

The traceability gate ensures that the document can be audited.

6.1 Required Traceability

Every major factual claim should cite:

file span,
document span,
schema pointer,
graph edge/path,
memory with original evidence,
human review if relevant.

6.2 Traceability Levels

Level	Description
line-level	file path + line range
section-level	doc section evidence
artifact-level	file/doc only
derived	graph/memory with underlying evidence
none	no traceability

Line-level is best for code facts.

6.3 Traceability Report

traceability:
  totalClaims: 24
  claimsWithLineEvidence: 19
  claimsWithArtifactOnly: 3
  claimsWithoutEvidence: 2
  score: 0.79

6.4 Gate Rule

passIf:
  claimsWithoutEvidence: 0
  lineLevelTraceabilityForCodeClaims: ">= 0.80"
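As a rough illustration, the score and gate rule could be computed like this. `TraceabilitySketch` and `Counts` are hypothetical names, and the sketch assumes every counted claim is a code claim, so the line-level ratio is taken over all claims.

```java
final class TraceabilitySketch {
    record Counts(int totalClaims, int lineLevel, int artifactOnly, int withoutEvidence) {}

    // Share of claims backed by line-level evidence, rounded to two decimals for reporting.
    static double lineLevelScore(Counts c) {
        if (c.totalClaims() == 0) {
            return 0.0;
        }
        return Math.round(100.0 * c.lineLevel() / c.totalClaims()) / 100.0;
    }

    // Gate rule: no claim may lack evidence, and at least 80% must have line-level evidence.
    static boolean passes(Counts c) {
        if (c.totalClaims() == 0) {
            return false;
        }
        double ratio = (double) c.lineLevel() / c.totalClaims();
        return c.withoutEvidence() == 0 && ratio >= 0.80;
    }
}
```

With the counts from the report in 6.3 (24 claims, 19 with line evidence, 2 without evidence), the score is 19/24, which rounds to 0.79, and the gate does not pass.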

7. Completeness Gate

The completeness gate checks whether a doc meets the minimum structure for its doc type.

7.1 Module Doc Completeness

Required:

purpose,
scope/boundary,
main components,
control flow,
related tests,
evidence,
uncertainty/freshness.

7.2 API Doc Completeness

Required:

method/path,
contract source,
request schema,
response schema,
handler,
error behavior or uncertainty,
tests or absence report,
evidence.

7.3 Runbook Completeness

Required:

symptoms,
diagnosis,
mitigation,
rollback or absence,
escalation,
verification,
freshness.

7.4 Completeness Report

completeness:
  docType: module_doc
  requiredSections:
    Purpose: present
    Scope: present
    MainComponents: present
    ControlFlow: present
    RelatedTests: missing
    Evidence: present
    Uncertainties: present
  status: warn

7.5 Missing Evidence vs Missing Section

If the repo has no tests, the section must still say so:

## Related Tests

No tests linked to this module were found in the indexed snapshot.

That is better than omitting the section.
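A structure check of this kind can be sketched as a heading scan. The sketch below assumes generated docs use `## ` headings, as in the example above, and that heading text matches the required section names case-insensitively; `CompletenessSketch` and `missingSections` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class CompletenessSketch {
    // Returns the required section names that have no matching "## " heading.
    static List<String> missingSections(String markdown, List<String> required) {
        Set<String> headings = markdown.lines()
            .map(String::strip)
            .filter(line -> line.startsWith("## "))
            .map(line -> line.substring(3).strip().toLowerCase())
            .collect(Collectors.toSet());

        List<String> missing = new ArrayList<>();
        for (String section : required) {
            if (!headings.contains(section.toLowerCase())) {
                missing.add(section);
            }
        }
        return missing;
    }
}
```

A section that is present but only reports absence ("No tests ... were found") counts as present here, which matches the distinction between missing evidence and a missing section.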

8. Freshness Gate

The freshness gate ensures that docs represent a clearly identified source version.

8.1 Freshness Inputs

source commit,
current commit,
evidence file hashes,
graph diff,
doc generation timestamp,
template version,
context pack ID,
review timestamp.

8.2 Freshness Status

Status	Meaning
current	evidence unchanged
partially_stale	some source changed
stale	key evidence changed
invalid	referenced symbol/dependency deleted
unknown	no evidence map

8.3 Freshness Report

freshness:
  status: partially_stale
  sourceCommit: 6f41ab2
  currentCommit: 9ab812c
  changedEvidence:
    - path: OrderValidator.java
      impact: section_control_flow
  recommendedAction: regenerate_affected_sections

8.4 Gate Rule

For new generated doc:

passIf:
  sourceCommitPresent: true
  evidenceMapPresent: true
  staleEvidenceUsedAsPrimary: false

For existing doc health:

failIf:
  referencedSymbolDeleted: true
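One way to derive the statuses in 8.2 is to compare the evidence hashes recorded at generation time with the hashes at the current commit. This is a sketch under assumptions: `FreshnessSketch`, `Evidence`, and the `primary` flag for key evidence are hypothetical, and a missing file stands in for a deleted symbol or dependency.

```java
import java.util.List;
import java.util.Map;

final class FreshnessSketch {
    enum Status { CURRENT, PARTIALLY_STALE, STALE, INVALID, UNKNOWN }

    // "primary" marks key evidence; how evidence is ranked is an assumption here.
    record Evidence(String path, String hashAtGeneration, boolean primary) {}

    static Status classify(List<Evidence> evidence, Map<String, String> currentHashes) {
        if (evidence.isEmpty()) {
            return Status.UNKNOWN; // no evidence map to compare against
        }
        boolean anyChanged = false;
        boolean primaryChanged = false;
        for (Evidence e : evidence) {
            String current = currentHashes.get(e.path());
            if (current == null) {
                return Status.INVALID; // referenced file no longer exists
            }
            if (!current.equals(e.hashAtGeneration())) {
                anyChanged = true;
                primaryChanged |= e.primary();
            }
        }
        if (primaryChanged) {
            return Status.STALE;
        }
        return anyChanged ? Status.PARTIALLY_STALE : Status.CURRENT;
    }
}
```

A real implementation would likely resolve evidence at symbol level through the graph diff rather than at file level.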

9. Scope Correctness Gate

Docs must not make claims outside the target scope.

9.1 Scope Violations

A module doc for order.validation should not claim:

The billing service charges customer cards.

unless cross-module context is explicitly in scope.

9.2 Scope Checks

cited evidence belongs to target scope or allowed related scope,
sections do not introduce unrelated modules,
graph expansion depth within policy,
multi-repo evidence labeled.

9.3 Scope Report

scope:
  requested:
    type: module
    path: src/main/java/com/acme/order/validation
  outOfScopeClaims:
    - claim: "Billing service charges customer cards."
      reason: "No allowed cross-repo scope."
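The first scope check (cited evidence belongs to the target or an allowed related scope) can be approximated with path prefixes. In this sketch, `ScopeCheckSketch`, `Claim`, and the prefix-based notion of scope are assumptions; claims with no evidence at all are left to the traceability gate.

```java
import java.util.List;

final class ScopeCheckSketch {
    record Claim(String text, List<String> evidencePaths) {}

    // A claim is out of scope when any cited path falls outside every allowed prefix.
    static List<Claim> outOfScopeClaims(List<Claim> claims, List<String> allowedPrefixes) {
        return claims.stream()
            .filter(c -> c.evidencePaths().stream()
                .anyMatch(path -> allowedPrefixes.stream().noneMatch(path::startsWith)))
            .toList();
    }
}
```

Allowed related scopes can be expressed by adding more prefixes to the list, which keeps the policy explicit and reviewable.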

10. Conflict Gate

Conflict gate detects contradictions.

10.1 Conflict Types

Type	Example
doc vs source	doc says `RuleEngine`, source has `RuleRegistry`
doc vs contract	doc says `/order`, OpenAPI says `/orders`
generated section vs another section	section A says sync, section B says async
memory vs source	memory refers to deleted symbol
stale docs vs current graph	old flow differs

10.2 Conflict Report

conflicts:
  - type: doc_vs_source
    severity: high
    claim: "Validation rules are loaded from database."
    evidence:
      currentSource: "RuleRegistry uses in-memory registered rules."
    action: revise_claim

10.3 Gate Rule

High severity contradictions fail.

Medium contradictions can pass only if clearly marked as uncertainty.

11. Style and Structure Gate

Docs must be readable and consistent.

11.1 Style Checks

headings follow template,
no excessive verbosity,
no marketing language,
no unsupported adjectives,
consistent terminology,
code identifiers formatted correctly,
tables readable,
Mermaid valid if included.

11.2 Bad Style

This amazing service robustly handles all order use cases using best practices.

Problems:

vague,
unsupported,
marketing language.

11.3 Better Style

`OrderService.createOrder` validates the request through `OrderValidator.validate` before calling `OrderRepository.save`. [E1]

11.4 Style Report

style:
  status: warn
  issues:
    - type: unsupported_adjective
      text: "robustly"
    - type: vague_statement
      text: "handles everything related to orders"

12. Duplication Gate

Duplicate docs create maintenance risk.

12.1 Duplication Types

Type	Example
exact duplicate	same paragraph repeated
semantic duplicate	two docs explain same module
conflicting duplicate	two docs explain same flow differently
generated duplicate	generator creates new doc instead of updating old
copy-pasted source	docs paste large code blocks unnecessarily

12.2 Detection Signals

same scope,
same title,
same linked symbols,
high text similarity,
same evidence map,
same doc type.

12.3 Duplication Report

duplication:
  status: warn
  similarDocs:
    - path: docs/order-validation.md
      similarity: 0.84
      sameScope: true
  recommendation: update_existing_doc_instead_of_create_new

12.4 Gate Rule

If a doc with the same scope and doc type already exists, prefer updating or regenerating it over creating a new doc.
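The "high text similarity" signal could be computed in many ways; the sketch below uses Jaccard similarity over lowercase word sets as one simple stand-in, not necessarily what the platform uses. `DuplicationSketch` and its method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

final class DuplicationSketch {
    // Splits text into a set of lowercase alphanumeric tokens.
    static Set<String> tokens(String text) {
        Set<String> out = new HashSet<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                out.add(token);
            }
        }
        return out;
    }

    // Jaccard similarity: size of the intersection divided by size of the union.
    static double jaccard(String a, String b) {
        Set<String> ta = tokens(a);
        Set<String> tb = tokens(b);
        if (ta.isEmpty() && tb.isEmpty()) {
            return 1.0;
        }
        Set<String> intersection = new HashSet<>(ta);
        intersection.retainAll(tb);
        Set<String> union = new HashSet<>(ta);
        union.addAll(tb);
        return (double) intersection.size() / union.size();
    }

    // Gate rule: an existing doc with the same scope and doc type should be updated.
    static boolean preferUpdate(boolean sameScope, boolean sameDocType) {
        return sameScope && sameDocType;
    }
}
```

Word-set similarity misses semantic duplicates that use different wording, so it works best combined with the scope and linked-symbol signals listed in 12.2.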

13. Security Gate

Documentation must not leak sensitive information.

13.1 Security Checks

no secret values,
no blocked-sensitive chunks,
generated doc visibility no broader than source,
no unauthorized cross-repo evidence,
redaction applied,
memory not leaking private source,
operational docs do not expose dangerous commands without review.

13.2 Security Finding

security:
  status: fail
  findings:
    - type: secret_like_value
      section: Configuration
      action: redact

13.3 Gate Rule

Security failure blocks publication.

Secret leakage is never warning-only.
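A very small secret-like-value check might look like the sketch below. The regular expression is illustrative only and will miss many real secret formats; `SecretScanSketch` is a hypothetical name, and a production gate would normally delegate to a dedicated secret scanner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class SecretScanSketch {
    // Illustrative pattern: a sensitive-looking key name followed by ':' or '=' and a value.
    private static final Pattern ASSIGNMENT = Pattern.compile(
        "(?i)\\b(password|passwd|secret|api[_-]?key|token)\\b\\s*[:=]\\s*\\S+");

    // Returns 1-based line numbers that contain a secret-like assignment.
    static List<Integer> findingLines(String document) {
        List<Integer> lines = new ArrayList<>();
        String[] parts = document.split("\n", -1);
        for (int i = 0; i < parts.length; i++) {
            if (ASSIGNMENT.matcher(parts[i]).find()) {
                lines.add(i + 1);
            }
        }
        return lines;
    }

    // Any finding blocks publication; there is no warning-only outcome.
    static boolean blocksPublication(String document) {
        return !findingLines(document).isEmpty();
    }
}
```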

14. Review Readiness Gate

A doc can be accurate but not reviewable.

14.1 Review Readiness Checks

doc has owner/reviewer,
quality report present,
evidence map present,
diff present if updating existing doc,
unsupported claims listed,
warnings clear,
source commit present,
generated status visible.

14.2 Review Package

reviewPackage:
  doc: generated/order-validation.md
  qualityReport: generated/order-validation.quality.yaml
  evidenceMap: generated/order-validation.evidence.json
  contextPack: ctx_01J
  suggestedReviewers:
    - team-order-platform

14.3 Gate Rule

If no reviewer can be determined for a doc type that requires review, mark review readiness as fail or warn, depending on policy.

15. Agent Usefulness Gate

Some docs are intended for AI agents.

15.1 Agent Context Quality

Check:

exact symbols included,
constraints explicit,
tests listed,
memory separated from source,
stale docs excluded,
tool boundaries included,
output is compact enough.

15.2 Agent Doc Anti-Pattern

Bad:

This module is important and has many validation responsibilities...

Good:

target: OrderValidator.validate
mustInspect:
  - OrderValidator.java
  - RuleRegistry.java
relatedTests:
  - OrderValidatorTest
prohibitedActions:
  - edit generated OpenAPI client

15.3 Gate Rule

Agent docs should fail if they lack exact target references.

16. Doc Debt Scoring

Doc debt is accumulated documentation risk.

16.1 Doc Debt Inputs

Signal	Debt
stale critical doc	high
missing runbook for critical service	high
missing API docs	medium/high
unreviewed generated docs	medium
duplicate/conflicting docs	medium/high
low evidence coverage	medium
no owner	medium
missing tests section	low/medium
old README	low/medium

16.2 Doc Debt Score

docDebt:
  repositoryId: order-service
  score: 72
  band: high
  reasons:
    - "Runbook missing"
    - "2 module docs stale"
    - "API docs missing for 3 endpoints"
    - "Generated docs pending review"

16.3 Score Formula Example

docDebtScore =
    staleCriticalDocs * 20
  + missingRequiredDocs * 15
  + conflictedDocs * 15
  + unreviewedGeneratedDocs * 8
  + lowTraceabilityDocs * 6
  + ownerlessDocs * 5

Use configurable weights.
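The formula translates directly into code with the weights held in a configurable record. In this sketch, `DocDebtSketch`, `Counts`, and `Weights` are hypothetical names, and the band thresholds are assumptions, since the lesson does not define where "high" begins.

```java
final class DocDebtSketch {
    record Counts(int staleCriticalDocs, int missingRequiredDocs, int conflictedDocs,
                  int unreviewedGeneratedDocs, int lowTraceabilityDocs, int ownerlessDocs) {}

    record Weights(int staleCritical, int missingRequired, int conflicted,
                   int unreviewedGenerated, int lowTraceability, int ownerless) {
        // Weights from the example formula; meant to be overridden per organization.
        static Weights defaults() {
            return new Weights(20, 15, 15, 8, 6, 5);
        }
    }

    static int score(Counts c, Weights w) {
        return c.staleCriticalDocs() * w.staleCritical()
             + c.missingRequiredDocs() * w.missingRequired()
             + c.conflictedDocs() * w.conflicted()
             + c.unreviewedGeneratedDocs() * w.unreviewedGenerated()
             + c.lowTraceabilityDocs() * w.lowTraceability()
             + c.ownerlessDocs() * w.ownerless();
    }

    // Band thresholds are hypothetical; tune them to the organization's policy.
    static String band(int score) {
        if (score >= 60) {
            return "high";
        }
        if (score >= 30) {
            return "medium";
        }
        return "low";
    }
}
```

For example, 2 stale critical docs, 1 missing required doc, 2 unreviewed generated docs, and 1 ownerless doc give 40 + 15 + 16 + 5 = 76 with the default weights.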

16.4 Use Doc Debt

Use it to:

prioritize documentation work,
alert owners,
block release if policy requires,
choose regeneration candidates,
measure platform impact.

17. Automated Evaluation

17.1 Evaluation Levels

Level	Method
syntax	markdown/frontmatter/links
structure	required sections
citation	citation IDs valid
evidence	claims supported
freshness	source unchanged
style	linting
security	secret scanning
semantic	claim verification
human	reviewer feedback

17.2 Golden Doc Eval

For a known repo fixture, the expected docs should satisfy:

mustMention:
  - OrderValidator
  - RuleRegistry
  - OrderValidatorTest
mustNotMention:
  - OrderRuleEngine
mustCite:
  - OrderValidator.java
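A golden-doc check over these expectations could be sketched as follows. `GoldenDocEvalSketch` and `Expectation` are hypothetical names; identifiers are matched on word boundaries so that `OrderValidatorTest` does not count as a mention of `OrderValidator`, and citations are matched by path suffix, which is an assumption about how cited paths are recorded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class GoldenDocEvalSketch {
    record Expectation(List<String> mustMention, List<String> mustNotMention, List<String> mustCite) {}

    // Returns human-readable violations; an empty list means the doc matches the fixture.
    static List<String> violations(String doc, List<String> citedPaths, Expectation expected) {
        List<String> out = new ArrayList<>();
        for (String name : expected.mustMention()) {
            if (!mentions(doc, name)) {
                out.add("missing mention: " + name);
            }
        }
        for (String name : expected.mustNotMention()) {
            if (mentions(doc, name)) {
                out.add("forbidden mention: " + name);
            }
        }
        for (String file : expected.mustCite()) {
            if (citedPaths.stream().noneMatch(path -> path.endsWith(file))) {
                out.add("missing citation: " + file);
            }
        }
        return out;
    }

    // Whole-identifier match, so a longer identifier does not satisfy a shorter one.
    private static boolean mentions(String doc, String identifier) {
        return Pattern.compile("\\b" + Pattern.quote(identifier) + "\\b").matcher(doc).find();
    }
}
```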

17.3 Regression Tests

Run eval after changes to:

prompt template,
retrieval ranking,
context assembly,
claim verifier,
parser,
graph builder.

17.4 Metrics

Metric	Meaning
evidence coverage	claims with evidence
unsupported claim rate	hallucination risk
contradiction rate	correctness risk
stale evidence rate	freshness
review approval rate	human acceptance
revision count	review friction
doc debt trend	overall health
agent task success	downstream usefulness

18. Human Review Loop

18.1 Reviewer Feedback Types

claim incorrect,
missing component,
too verbose,
wrong audience,
stale evidence,
bad structure,
good output,
memory candidate approved,
memory candidate rejected.

18.2 Feedback Schema

reviewFeedback:
  documentId: doc_01J
  reviewer: team-order-platform
  decision: request_changes
  comments:
    - section: Control Flow
      issue: "Missing RuleRegistryTest."
      severity: medium
  reusableLessons:
    - "Validation module docs should include RuleRegistryTest."

18.3 Feedback to Evaluation Memory

Review feedback can create an evaluation memory candidate.

memoryCandidate:
  type: evaluation_lesson
  statement: "Validation module docs should include RuleRegistryTest when present."
  evidence:
    - reviewFeedbackId: review_01J

Do not auto-approve such candidates unless a policy explicitly allows it.

19. Quality Report Format

19.1 Summary

qualityReport:
  documentId: docgen_01J
  status: pass_with_warnings
  overallScore: 0.84
  docType: module_doc
  sourceCommit: 6f41ab2

19.2 Dimensions

dimensions:
  accuracy:
    status: pass
    score: 0.91
  completeness:
    status: warn
    missing:
      - related_tests_section_content
  traceability:
    status: pass
    score: 0.88
  freshness:
    status: pass
  security:
    status: pass

19.3 Action Items

actions:
  - type: review
    message: "Confirm uncertainty about retry behavior."
  - type: improve_docs
    message: "Add tests for RuleRegistry or document absence."

19.4 User-Facing Summary

Quality: Pass with warnings

- Evidence coverage: good
- Unsupported claims: 0
- Missing evidence: retry behavior
- Review required: yes

20. Quality Gate Implementation

20.1 Interface

// Contract shared by every gate: evaluate one input, return one result.
public interface DocumentationQualityGate {
    QualityGateResult evaluate(QualityGateInput input);
}

20.2 Input

public record QualityGateInput(
    GeneratedDocument document,
    DocumentationRequest request,
    DocumentationTemplate template,
    EvidenceMap evidenceMap,
    ContextPack contextPack,
    GraphSnapshot graphSnapshot,
    Principal principal
) {}

20.3 Result

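import java.util.List;

// Outcome of a single gate: status, detailed findings, score, and follow-up actions.
public record QualityGateResult(
    QualityStatus status,
    List<QualityFinding> findings,
    QualityScore score,
    List<RecommendedAction> actions
) {}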

20.4 Gate Chain

List<DocumentationQualityGate> gates = List.of(
    new StructureGate(),
    new CitationGate(),
    new ClaimAccuracyGate(),
    new FreshnessGate(),
    new ConflictGate(),
    new SecurityGate(),
    new StyleGate(),
    new ReviewReadinessGate()
);
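How the chain is executed is not specified above. One possible approach, sketched here with simplified stand-in types, is to run every gate and report the worst status as the overall result. `GateChainSketch`, `Gate`, `Result`, and `Report` are hypothetical and deliberately smaller than `DocumentationQualityGate` and `QualityGateResult`.

```java
import java.util.ArrayList;
import java.util.List;

final class GateChainSketch {
    // Declared from least to most severe so compareTo orders by severity.
    enum Status { PASS, WARN, FAIL }

    record Result(String gate, Status status) {}

    record Report(Status overall, List<Result> results) {}

    // Simplified stand-in for a quality gate: takes the document text, returns a result.
    interface Gate {
        Result evaluate(String document);
    }

    // Runs every gate so the reviewer sees all findings, then reports the worst status.
    static Report run(List<Gate> gates, String document) {
        List<Result> results = new ArrayList<>();
        Status overall = Status.PASS;
        for (Gate gate : gates) {
            Result result = gate.evaluate(document);
            results.add(result);
            if (result.status().compareTo(overall) > 0) {
                overall = result.status();
            }
        }
        return new Report(overall, List.copyOf(results));
    }
}
```

Running all gates instead of stopping at the first failure costs more compute but gives reviewers a complete report in one pass.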

21. CI/CD Integration

Docs quality gates can run in CI.

21.1 Use Cases

PR modifies code but docs stale,
PR modifies docs with unsupported claim,
generated docs missing evidence,
runbook changed without owner review,
API change without API docs refresh.

21.2 CI Report

documentationCheck:
  status: warn
  findings:
    - "Order validation docs may be stale due to changed OrderValidator.java"
    - "API docs missing for new POST /orders/bulk endpoint"

21.3 Blocking Policy

Do not block every warning.

Block for:

secret leakage,
contradicted critical docs,
missing required compliance/runbook docs,
broken links in official docs,
high-risk stale docs for changed critical APIs.
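The blocking list above can be kept as data so that CI only fails on the listed categories. The finding type identifiers in this sketch are hypothetical names chosen for illustration, as is `BlockingPolicySketch`.

```java
import java.util.List;
import java.util.Set;

final class BlockingPolicySketch {
    // Hypothetical identifiers for the blocking categories listed above.
    static final Set<String> BLOCKING_TYPES = Set.of(
        "secret_leakage",
        "contradicted_critical_doc",
        "missing_required_doc",
        "broken_official_link",
        "stale_critical_api_doc");

    // CI blocks only when at least one finding belongs to a blocking category.
    static boolean shouldBlock(List<String> findingTypes) {
        return findingTypes.stream().anyMatch(BLOCKING_TYPES::contains);
    }
}
```

Everything outside the set is reported as a warning, which keeps the check from blocking every pull request.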

22. Common Mistakes

22.1 Only Checking Grammar

Good grammar does not mean correct docs.

22.2 No Claim-Level Verification

Section-level quality is too coarse.

22.3 Ignoring Freshness

A perfect doc generated from an old commit can be wrong today.

22.4 No Review State

Readers need to know whether a doc is generated or unreviewed.

22.5 Publishing Warnings as Hidden Metadata

Warnings should be visible to reviewers.

22.6 No Security Gate

Docs can leak secrets and sensitive architecture details.

22.7 One Quality Score for Everything

API docs and ADRs require different gates.

22.8 No Feedback Loop

Human review should improve future generation/eval.

23. Practical Exercise

Build quality gates for module docs.

23.1 Input

Use generated doc:

generated/order-validation.md
evidence-map.json
context-pack.md
graph-snapshot.json

23.2 Output

Produce:

quality-report.yaml
claim-verification.yaml
doc-debt-report.yaml
review-checklist.md

23.3 Required Gates

structure gate,
citation gate,
accuracy gate,
freshness gate,
conflict gate,
security gate,
review readiness gate.

23.4 Acceptance Criteria

unsupported claims detected,
missing required sections reported,
stale evidence reported,
secret-like content blocked,
doc debt score computed,
review checklist generated,
pass/fail status explicit.

24. Summary

Doc quality gates are the trust boundary for AI-generated documentation.

Key points:

quality is multi-dimensional,
accuracy requires claim-level evidence,
traceability is mandatory for trust,
completeness depends on doc type,
freshness must be tied to source evidence,
conflicts must be detected and surfaced,
security failures block publication,
docs need review readiness artifacts,
doc debt scoring helps prioritization,
evaluation and human feedback should continuously improve the platform.

The next part covers Multi-Repository Documentation: how to generate docs across repositories, services, events, APIs, ownership, and dependencies without breaking permissions, provenance, and version alignment.
