
Learn Build From Scratch Recommendations System Part 075 Minimum Production Skeleton


title: Build From Scratch Recommendations System - Part 075
description: Designing a minimum production skeleton for a from-scratch recommendation system: service boundaries, repository structure, OpenAPI contracts, event schemas, PostgreSQL/Redis/Kafka integration, candidate/ranking/reranking skeleton, feature/profile stores, observability, CI/CD, and first production slice.
series: learn-build-from-scratch-recommendations-system
seriesTitle: Build From Scratch: Enterprise Recommendations System
order: 75
partTitle: Minimum Production Skeleton
tags:
  - recommendation-system
  - recsys
  - build-from-scratch
  - production-skeleton
  - java
  - system-design
  - series
date: 2026-07-02

Part 075 — Minimum Production Skeleton

Starting with Part 075, we enter Module 10: Build From Scratch Implementation Tracks.

Having covered mental models, data, candidate generation, ranking, reranking, serving, MLOps, observability, governance, security, privacy, safety, and the operating model, we now build a realistic minimal shape.

Not a dummy project.

Not a playground.

Not “recommend a random item from an array”.

A minimum production skeleton means:

small enough to build,
yet with correct boundaries, contracts, events, observability, fallback, config, and lifecycle.

This skeleton does not need sophisticated deep learning yet. But it must have a structure that can grow into a production-grade platform.

This part covers the blueprint for a minimum production skeleton of a recommendation system: service boundaries, repository structure, OpenAPI contracts, event schemas, PostgreSQL/Redis/Kafka, candidate/ranking/reranking skeleton, feature/profile stores, decision logging, observability, CI/CD, and first production slice.

1. Mental Model: Build the Production Shape Before the Fancy Model

A common mistake:

build model first, platform later

For production RecSys, it is safer to:

build decision platform skeleton first,
then improve candidate/ranking quality iteratively

Minimum skeleton should support:

request/response contract,
candidate source plugin,
eligibility filtering,
basic ranking,
slate construction,
tracking tokens,
decision logging,
event ingestion,
minimal feature/profile store,
observability,
fallback,
config-driven behavior,
testing.

If this shape is right, the model can be replaced incrementally.

2. First Production Slice

Target first slice:

Home recommendations for known users and anonymous users

Capabilities:

personalized if profile exists,
contextual fallback if no profile/consent,
item eligibility,
simple ranker,
basic diversity/frequency control,
decision log,
impression/click feedback,
dashboard,
safe fallback.

Not included initially:

deep neural ranker,
real-time two-tower training,
advanced bandit,
complex LLM agent,
full multi-tenant enterprise config,
automated retraining.

Start correct, not bloated.

3. Skeleton Architecture

The minimum version can start as a modular monolith, as long as module boundaries are preserved.

4. Start Modular Monolith or Microservices?

For first skeleton, prefer modular monolith or small service set.

Why?

faster iteration,
easier debugging,
fewer distributed failures,
simpler deployment,
lower ops overhead.

But design modules as if they can split later:

rec-api
candidate
eligibility
ranking
slate
profile
feature
events
config

Do not create 12 microservices before product fit.

5. Suggested Repository Structure

recommendation-platform/
  README.md
  docs/
    architecture/
    api/
    runbooks/
  contracts/
    openapi/
      recommendation-api.yaml
    events/
      decision-log.avsc
      impression-event.avsc
      action-event.avsc
  services/
    rec-api/
    event-ingestion/
    batch-jobs/
  libs/
    rec-domain/
    rec-contracts/
    rec-observability/
    rec-testing/
  infrastructure/
    docker-compose.yml
    local/
    migrations/
  pipelines/
    feature-jobs/
    dataset-builder/
  dashboards/
  scripts/

Even if everything runs as one JVM app, keep contracts and domain modules clean.

6. Java Module Layout

Example Maven/Gradle modules:

rec-domain
rec-api-contract
rec-application
rec-candidate
rec-eligibility
rec-ranking
rec-slate
rec-profile
rec-feature
rec-events
rec-config
rec-observability
rec-infra-postgres
rec-infra-redis
rec-infra-kafka
rec-service

Domain should not depend on infrastructure.

Keep business logic testable.

7. Core Domain Objects

Minimum domain:

public record RecommendationRequest(
    String requestId,
    Subject subject,
    Surface surface,
    RequestContext context,
    int limit
) {}

public record Candidate(
    String itemId,
    String itemType,
    List<SourceEvidence> sources,
    Map<String, Object> attributes
) {}

public record RankedCandidate(
    Candidate candidate,
    double score,
    Map<String, Double> scoreComponents
) {}

public record RecommendationSlate(
    String slateId,
    List<RecommendationItem> items,
    SlateMetadata metadata
) {}

Use explicit types for request, candidate, ranked candidate, and slate.

8. API Contract

OpenAPI endpoint:

POST /v1/recommendations/{surface}

Request:

{
  "request_id": "req_001",
  "subject": {
    "user_id": "u123",
    "anonymous_id": "anon_456",
    "session_id": "sess_789",
    "tenant_id": "default"
  },
  "context": {
    "region": "ID",
    "locale": "id-ID",
    "device_type": "mobile",
    "privacy_mode": "personalized"
  },
  "limit": 20,
  "debug": false
}

Response:

{
  "request_id": "req_001",
  "slate_id": "slate_abc",
  "items": [
    {
      "item_id": "item_123",
      "position": 1,
      "tracking_token": "opaque-token",
      "reason_codes": ["popular_in_category"]
    }
  ],
  "metadata": {
    "model_version": "baseline_ranker_v1",
    "policy_version": "home_slate_v1",
    "fallback_used": false
  }
}

9. Tracking Token

Tracking token should encode or reference:

request_id
slate_id
impression_id
item_id
position
surface
model version
policy version
experiment variant

Token should be opaque to client.

Use signed token or server-side lookup.

Do not trust client-supplied item/position without validation if used for training.
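
As an illustration of the signed-token option, here is a minimal sketch using HMAC-SHA256 from the JDK. The class name `TrackingTokens`, the pipe-separated payload layout, and the secret handling are assumptions made for this example, not part of the skeleton's contract. Note that signing makes the token tamper-evident but not confidential: a client can still base64-decode the payload, so if fields must stay hidden, encrypt the payload or use the server-side lookup variant instead.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical token format: base64url(payload) + "." + base64url(HMAC-SHA256(payload)). */
final class TrackingTokens {
    private static final String ALGORITHM = "HmacSHA256";

    private final byte[] secret;

    TrackingTokens(byte[] secret) {
        this.secret = secret.clone();
    }

    /** Signs a payload such as "req_001|slate_abc|imp_001|item_123|1|home_feed". */
    String sign(String payload) {
        byte[] body = payload.getBytes(StandardCharsets.UTF_8);
        Base64.Encoder encoder = Base64.getUrlEncoder().withoutPadding();
        return encoder.encodeToString(body) + "." + encoder.encodeToString(hmac(body));
    }

    /** Returns the original payload if the signature is valid, otherwise null. */
    String verify(String token) {
        int dot = token.indexOf('.');
        if (dot <= 0 || dot == token.length() - 1) {
            return null; // missing payload or missing signature
        }
        try {
            Base64.Decoder decoder = Base64.getUrlDecoder();
            byte[] body = decoder.decode(token.substring(0, dot));
            byte[] signature = decoder.decode(token.substring(dot + 1));
            // Constant-time comparison avoids leaking signature bytes through timing.
            return MessageDigest.isEqual(hmac(body), signature)
                ? new String(body, StandardCharsets.UTF_8)
                : null;
        } catch (IllegalArgumentException malformedBase64) {
            return null;
        }
    }

    private byte[] hmac(byte[] body) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(new SecretKeySpec(secret, ALGORITHM));
            return mac.doFinal(body);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 is unavailable", e);
        }
    }
}
```

When an impression or click arrives, the ingestion side would call `verify` and take item and position from the returned payload rather than from client-supplied fields.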

10. Event Contracts

Minimum events:

decision log,
impression event,
action event,
item catalog event,
user feedback/suppression event.

Impression event:

{
  "event_id": "evt_001",
  "request_id": "req_001",
  "slate_id": "slate_abc",
  "impression_id": "imp_001",
  "item_id": "item_123",
  "position": 1,
  "surface": "home_feed",
  "user_id": "u123",
  "event_time": "2026-07-02T10:00:00Z",
  "tracking_token": "opaque-token"
}

Action event:

{
  "event_id": "evt_002",
  "impression_id": "imp_001",
  "action_type": "click",
  "event_time": "2026-07-02T10:01:00Z"
}

11. Decision Log

Decision log captures system decision.

Fields:

request_id
slate_id
surface
subject hash/id
context
candidate counts by source
filter counts/reasons
model version
policy version
experiment variants
final slate item IDs
scores (sampled, or full if allowed)
fallback reason
latency by stage

Decision log is internal.

It powers:

debugging,
training,
attribution,
replay,
observability.

12. Data Stores Minimum

Use:

PostgreSQL

For:

catalog snapshot,
item metadata,
config metadata,
decision log query index if needed,
event ingestion checkpoint,
small-scale batch outputs.

Redis

For:

profile/session state,
cache,
fallback lists,
frequency counters,
small-scale suppression state.

Kafka

For:

decision log stream,
impression/action events,
catalog events,
profile update stream.

This stack can grow.

13. Catalog Table

Minimum table:

CREATE TABLE rec_items (
    item_id TEXT PRIMARY KEY,
    item_type TEXT NOT NULL,
    title TEXT,
    category_id TEXT,
    creator_id TEXT,
    region TEXT,
    language TEXT,
    active BOOLEAN NOT NULL,
    recommendable BOOLEAN NOT NULL,
    quality_score DOUBLE PRECISION,
    created_at TIMESTAMPTZ NOT NULL,
    updated_at TIMESTAMPTZ NOT NULL
);

Do not recommend if active=false or recommendable=false.

14. User Profile Table/Cache

Simple profile:

{
  "user_id": "u123",
  "top_categories": {
    "camera": 0.8,
    "laptop": 0.4
  },
  "recent_item_ids": ["item_1", "item_2"],
  "updated_at": "2026-07-02T09:50:00Z"
}

Initial profile can be built from recent clicks/views.

Store in Redis for serving.

Later, move to proper profile store/feature store.

15. Feature Store Minimum

Do not overbuild feature store initially.

Start with:

item_features table/cache
user_profile cache
request context features
candidate source features

Feature object:

public record FeatureValue(
    String name,
    Object value,
    boolean missing,
    String missingReason
) {}

Add feature registry metadata from day one.

Even if simple.

16. Candidate Sources Minimum

Implement three sources:

Popular/trending by region/category.
Content/category-based from user profile.
Similar-to-recent-item.

Candidate source interface:

public interface CandidateSource {
    String name();

    CandidateSourceResult generate(CandidateRequest request);
}

Result:

public record CandidateSourceResult(
    String sourceName,
    List<Candidate> candidates,
    CandidateSourceDiagnostics diagnostics
) {}

17. Popular Candidate Source

SQL example:

SELECT item_id, quality_score
FROM rec_items
WHERE active = true
  AND recommendable = true
  AND region = :region
-- quality_score is nullable; NULLS LAST keeps unscored items from sorting first
ORDER BY quality_score DESC NULLS LAST, created_at DESC
LIMIT :limit;

Better later:

smoothed CTR,
trust-weighted popularity,
time decay,
segment trending.

But start safe and deterministic.

18. Profile Category Candidate Source

If user profile has categories:

camera: 0.8
laptop: 0.4

Fetch active items in those categories.

Score source evidence:

profile_category_affinity * item_quality

This creates simple personalization.

If profile missing, source returns empty.
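
The scoring rule above can be sketched in a few lines. This is an in-memory illustration only: `CatalogItem`, `ScoredItem`, and `ProfileCategorySource.generate` are hypothetical names, and a real source would query the catalog by category instead of scanning a list.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Hypothetical minimal view of a catalog row. */
record CatalogItem(String itemId, String categoryId, double qualityScore, boolean active) {}

/** Item plus the evidence score this source assigns to it. */
record ScoredItem(String itemId, double sourceScore) {}

final class ProfileCategorySource {

    /** topCategories is the profile's category-to-affinity map; null or empty means no profile. */
    static List<ScoredItem> generate(Map<String, Double> topCategories,
                                     List<CatalogItem> catalog,
                                     int limit) {
        if (topCategories == null || topCategories.isEmpty()) {
            return List.of(); // no profile: this source contributes nothing
        }
        List<ScoredItem> scored = new ArrayList<>();
        for (CatalogItem item : catalog) {
            Double affinity = topCategories.get(item.categoryId());
            if (item.active() && affinity != null) {
                // source evidence = profile_category_affinity * item_quality
                scored.add(new ScoredItem(item.itemId(), affinity * item.qualityScore()));
            }
        }
        // Highest score first; item id breaks ties so the output is deterministic.
        scored.sort(Comparator.<ScoredItem>comparingDouble(ScoredItem::sourceScore).reversed()
            .thenComparing(ScoredItem::itemId));
        return List.copyOf(scored.subList(0, Math.min(limit, scored.size())));
    }
}
```

With the profile above (camera 0.8, laptop 0.4), a camera with quality 0.9 scores roughly 0.72 and outranks a laptop with the same quality at roughly 0.36.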

19. Similar Recent Item Source

Use item metadata/category:

recent clicked item category -> similar items from same category

Later replace with item-to-item co-occurrence or embedding.

For skeleton, category similarity is enough.

20. Candidate Aggregation

Aggregate:

run sources in parallel if possible,
merge candidates,
dedup by item_id,
preserve source evidence,
cap candidate count.

public final class CandidateAggregator {
    public List<Candidate> merge(List<CandidateSourceResult> sourceResults) {
        Map<String, Candidate> byItem = new LinkedHashMap<>();

        for (CandidateSourceResult result : sourceResults) {
            for (Candidate candidate : result.candidates()) {
                byItem.merge(
                    candidate.itemId(),
                    candidate,
                    Candidate::mergeEvidence
                );
            }
        }

        return new ArrayList<>(byItem.values());
    }
}
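
The aggregator above relies on a `Candidate::mergeEvidence` method that the `Candidate` record in section 7 does not define. One possible shape, shown on a trimmed-down `Candidate` and with a hypothetical `SourceEvidence(sourceName, sourceScore)` record, is sketched here together with the candidate cap from the list above. Treat the names and the first-seen ordering as assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical evidence that one source proposed an item. */
record SourceEvidence(String sourceName, double sourceScore) {}

/** Trimmed-down Candidate carrying only what merging needs. */
record Candidate(String itemId, List<SourceEvidence> sources) {

    /** Combines the evidence of two candidates that refer to the same item. */
    static Candidate mergeEvidence(Candidate first, Candidate second) {
        List<SourceEvidence> combined = new ArrayList<>(first.sources());
        combined.addAll(second.sources());
        return new Candidate(first.itemId(), List.copyOf(combined));
    }
}

final class CandidateMerging {

    /** Dedups by itemId in first-seen order, then caps the size of the pool. */
    static List<Candidate> mergeAndCap(List<List<Candidate>> perSource, int maxCandidates) {
        Map<String, Candidate> byItem = new LinkedHashMap<>();
        for (List<Candidate> candidates : perSource) {
            for (Candidate candidate : candidates) {
                byItem.merge(candidate.itemId(), candidate, Candidate::mergeEvidence);
            }
        }
        return byItem.values().stream().limit(maxCandidates).toList();
    }
}
```

Because the cap is applied in first-seen order, the order in which sources are passed in decides which candidates survive. A production version may prefer to cap per source or by source score.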

21. Eligibility Filter Minimum

Rules:

active item
recommendable item
region match
language if needed
not suppressed
not recently seen
not duplicate

Filter result:

public record FilterDecision(
    String itemId,
    boolean allowed,
    String reasonCode
) {}

Log rejection counts.
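
A minimal sketch of how these rules could produce one `FilterDecision` per item and a rejection count per reason. The `Item` and `UserSuppression` records, the reason code strings, and the rule order are assumptions for illustration. The first failing rule determines the reason code.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical minimal item view for eligibility checks. */
record Item(String itemId, String creatorId, String region, boolean active, boolean recommendable) {}

/** Hypothetical per-user suppression state, e.g. loaded from Redis sets. */
record UserSuppression(Set<String> hiddenItems, Set<String> blockedCreators, Set<String> seenItems) {}

record FilterDecision(String itemId, boolean allowed, String reasonCode) {}

final class EligibilityFilter {

    static FilterDecision decide(Item item, String requestRegion, UserSuppression user) {
        String violation = firstViolation(item, requestRegion, user);
        return new FilterDecision(item.itemId(), violation == null, violation == null ? "ok" : violation);
    }

    /** Returns the reason code of the first failing rule, or null if the item is eligible. */
    private static String firstViolation(Item item, String requestRegion, UserSuppression user) {
        if (!item.active()) return "inactive";
        if (!item.recommendable()) return "not_recommendable";
        if (!Objects.equals(item.region(), requestRegion)) return "region_mismatch";
        if (user.hiddenItems().contains(item.itemId())) return "hidden_by_user";
        if (item.creatorId() != null && user.blockedCreators().contains(item.creatorId())) {
            return "blocked_creator";
        }
        if (user.seenItems().contains(item.itemId())) return "recently_seen";
        return null;
    }

    /** Counts rejections per reason code, ready to be emitted as a metric or log field. */
    static Map<String, Integer> rejectionCounts(List<FilterDecision> decisions) {
        Map<String, Integer> counts = new TreeMap<>();
        for (FilterDecision decision : decisions) {
            if (!decision.allowed()) {
                counts.merge(decision.reasonCode(), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Keeping the reason code on every decision is what later makes "filter rejection by reason" available to the decision log and the dashboard.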

22. Suppression Minimum

Support:

hidden item,
blocked creator,
recently seen item.

Redis keys:

user:{userId}:hidden_items
user:{userId}:blocked_creators
user:{userId}:seen_items_7d

Apply before ranking.

User controls should work quickly.

23. Ranking Minimum

Start with heuristic ranker.

Score:

score =
  0.50 * source_score
  + 0.30 * item_quality_score
  + 0.20 * profile_category_match
  - 0.30 * seen_penalty
  - 0.50 * low_quality_penalty

This is transparent, debuggable, and safe.

Do not start with black-box model if platform cannot debug yet.
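
The formula translates directly into code. In this sketch the weights are hardcoded to match the formula above; in the skeleton they would come from the ranking config. `RankingFeatures` and `HeuristicRanker` are hypothetical names, and all inputs are assumed to be normalized to the range 0 to 1.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical per-candidate inputs, each assumed to be normalized to [0, 1]. */
record RankingFeatures(
    double sourceScore,
    double itemQualityScore,
    double profileCategoryMatch,
    double seenPenalty,
    double lowQualityPenalty
) {}

final class HeuristicRanker {

    /** Weighted contribution of each term, kept separately so the score stays explainable. */
    static Map<String, Double> scoreComponents(RankingFeatures f) {
        Map<String, Double> parts = new LinkedHashMap<>();
        parts.put("source_score", 0.50 * f.sourceScore());
        parts.put("item_quality_score", 0.30 * f.itemQualityScore());
        parts.put("profile_category_match", 0.20 * f.profileCategoryMatch());
        parts.put("seen_penalty", -0.30 * f.seenPenalty());
        parts.put("low_quality_penalty", -0.50 * f.lowQualityPenalty());
        return parts;
    }

    /** Final score is the sum of the weighted components. */
    static double score(RankingFeatures f) {
        return scoreComponents(f).values().stream().mapToDouble(Double::doubleValue).sum();
    }
}
```

For example, a candidate with source score 0.8, quality 0.6, and a full category match scores about 0.40 + 0.18 + 0.20 = 0.78. If the user has already seen it (seen penalty 1.0), the score drops to about 0.48. The component map fits the `scoreComponents` field of `RankedCandidate` from section 7.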

24. Ranker Interface

public interface RankingService {
    RankingResult rank(RankingRequest request);
}

public record RankingRequest(
    List<Candidate> candidates,
    Subject subject,
    RequestContext context,
    FeatureBundle features,
    RankingConfig config
) {}

public record RankingResult(
    List<RankedCandidate> ranked,
    RankingDiagnostics diagnostics
) {}

Later replace heuristic with GBDT/deep model without changing orchestration.

25. Reranking Minimum

Rerank for:

no duplicate item,
max same category,
max same creator,
final limit,
optional exploration slot,
final eligibility check.

Greedy selection:

iterate ranked candidates
skip if violates hard slate rule
add until limit

This is enough for first production slice.
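
A sketch of the greedy pass, assuming the input is already sorted by score in descending order. `RankedItem`, `SlatePolicy`, and `GreedyReranker` are hypothetical names; the exploration slot and the final eligibility check are left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical flattened view of a ranked candidate. */
record RankedItem(String itemId, String categoryId, String creatorId, double score) {}

/** Hard slate rules, normally loaded from the slate policy config. */
record SlatePolicy(int limit, int maxSameCategory, int maxSameCreator) {}

final class GreedyReranker {

    /** Walks candidates in ranked order and keeps each one that violates no hard rule. */
    static List<RankedItem> select(List<RankedItem> rankedDescending, SlatePolicy policy) {
        List<RankedItem> slate = new ArrayList<>();
        Set<String> chosenIds = new HashSet<>();
        Map<String, Integer> perCategory = new HashMap<>();
        Map<String, Integer> perCreator = new HashMap<>();

        for (RankedItem item : rankedDescending) {
            if (slate.size() >= policy.limit()) {
                break; // slate is full
            }
            boolean duplicate = chosenIds.contains(item.itemId());
            boolean categoryFull =
                perCategory.getOrDefault(item.categoryId(), 0) >= policy.maxSameCategory();
            boolean creatorFull =
                perCreator.getOrDefault(item.creatorId(), 0) >= policy.maxSameCreator();
            if (duplicate || categoryFull || creatorFull) {
                continue; // skip and try the next best candidate
            }
            slate.add(item);
            chosenIds.add(item.itemId());
            perCategory.merge(item.categoryId(), 1, Integer::sum);
            perCreator.merge(item.creatorId(), 1, Integer::sum);
        }
        return slate;
    }
}
```

Greedy selection is not optimal for every diversity objective, but it is deterministic and easy to explain in a debug trace, which matters more at this stage.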

26. Slate Policy Config

surface: home_feed
limit: 20
max_same_category: 5
max_same_creator: 3
min_quality_score: 0.2
allow_exploration: false
fallback_policy: home_fallback_v1

Keep as config, not hardcoded.

27. Fallback Minimum

Fallback hierarchy:

personalized candidates
-> regional popular
-> editorial safe
-> empty safe

Fallback should still pass eligibility.

Log fallback reason.

Fallback is not optional.
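
One way to express the hierarchy is an ordered list of stages that is walked until one returns items. In this sketch `FallbackStage`, `FallbackOutcome`, and `FallbackChain` are hypothetical names, and each stage is assumed to return item IDs that have already passed eligibility.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** One level of the hierarchy; the supplier is assumed to return eligible item IDs. */
record FallbackStage(String name, Supplier<List<String>> source) {}

/** Which stage produced the slate, plus why earlier stages were skipped. */
record FallbackOutcome(List<String> itemIds, String stageUsed, List<String> skippedReasons) {}

final class FallbackChain {

    /** Tries stages in order; an empty result or an exception moves on to the next stage. */
    static FallbackOutcome resolve(List<FallbackStage> stages) {
        List<String> reasons = new ArrayList<>();
        for (FallbackStage stage : stages) {
            try {
                List<String> items = stage.source().get();
                if (items != null && !items.isEmpty()) {
                    return new FallbackOutcome(List.copyOf(items), stage.name(), List.copyOf(reasons));
                }
                reasons.add(stage.name() + ":empty");
            } catch (RuntimeException e) {
                reasons.add(stage.name() + ":error");
            }
        }
        // Last resort: an empty but well-formed slate instead of an error response.
        return new FallbackOutcome(List.of(), "empty_safe", List.copyOf(reasons));
    }
}
```

The `skippedReasons` list is what would be written to the decision log as the fallback reason, and `stageUsed` tells the caller whether `fallback_used` should be true in the response metadata.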

28. Config Store Minimum

Use YAML or DB-backed config.

Config types:

surface config
candidate source config
ranking config
slate policy
fallback policy
feature set

Version configs:

home_surface_v1
baseline_ranker_v1
home_slate_v1

Log versions in response metadata and decision log.

29. Observability Minimum

Metrics:

request count
latency p50/p95/p99
candidate count by source
filter rejection by reason
ranker latency
final slate size
fallback rate
empty slate rate
decision log success
impression/click event volume

Logs:

structured request summary,
decision log,
error logs.

Traces:

stage spans.

30. Debug Endpoint

Internal only:

GET /internal/debug/recommendations/{request_id}

Returns:

context,
candidates by source,
filter reasons,
key features,
scores,
reranking decisions,
final slate,
fallback.

Must be access-controlled and redacted.

31. Local Development Environment

Docker Compose:

postgres
redis
kafka
recommendation-service
event-ingestion-service

Seed data:

items,
profiles,
configs.

Scripts:

load sample catalog
simulate impressions/clicks
run local recommendation request

Good local environment accelerates learning.

32. CI Pipeline

CI checks:

compile
unit tests
contract tests
schema compatibility
migration tests
static analysis
container build
integration tests with testcontainers

Contract tests for API/events are important.

Do not break tracking event schema casually.

33. CD Pipeline

CD stages:

build artifact
run tests
deploy to staging
run smoke tests
shadow/canary
deploy production
monitor
rollback capability

Even skeleton should have rollback.

34. Testing Strategy

Tests:

Unit

candidate aggregation,
filter rules,
ranking score,
reranking policy,
tracking token generation.

Integration

DB/Redis/Kafka,
API response,
event emission.

Contract

OpenAPI,
event schemas.

Regression

hidden item not recommended,
inactive item not recommended,
fallback works.

35. Load Test Minimum

Test:

100 QPS
500 QPS
cold cache
candidate source timeout
Redis unavailable
ranker exception
Kafka unavailable

Measure:

latency,
fallback,
error rate,
decision logging.

Even small skeleton should know its failure behavior.

36. Event Ingestion Minimum

Pipeline:

Kafka impression/action event
-> validate schema
-> dedup by event_id
-> store clean event
-> update profile/session/suppression if needed

For skeleton, profile update can be simple:

on click -> increment category affinity
on impression -> add to seen set
on hide -> add item to hidden set
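
The pipeline and the three update rules can be sketched with in-memory state. This is an illustration under several assumptions: impressions and actions are flattened into one hypothetical `FeedbackEvent` type, the event is already enriched with `userId`, `itemId`, and `categoryId`, and plain Java collections stand in for Kafka and Redis. A real deployment would keep the dedup set in a store with a TTL.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical enriched event; actionType is "impression", "click", or "hide". */
record FeedbackEvent(String eventId, String userId, String itemId, String categoryId, String actionType) {}

final class InMemoryIngestion {
    private final Set<String> processedEventIds = new HashSet<>();
    final Map<String, Map<String, Double>> categoryAffinity = new HashMap<>();
    final Map<String, Set<String>> seenItems = new HashMap<>();
    final Map<String, Set<String>> hiddenItems = new HashMap<>();

    /** Returns true if the event was applied, false if it was invalid or a duplicate. */
    boolean ingest(FeedbackEvent event) {
        // Step 1: validate (stand-in for real schema validation).
        if (event.eventId() == null || event.userId() == null || event.itemId() == null
                || event.categoryId() == null || event.actionType() == null) {
            return false;
        }
        // Step 2: dedup by event_id; add() returns false when the id was already present.
        if (!processedEventIds.add(event.eventId())) {
            return false;
        }
        // Step 3: update profile, seen, or suppression state.
        switch (event.actionType()) {
            case "click" -> categoryAffinity
                .computeIfAbsent(event.userId(), k -> new HashMap<>())
                .merge(event.categoryId(), 1.0, Double::sum);
            case "impression" -> seenItems
                .computeIfAbsent(event.userId(), k -> new HashSet<>())
                .add(event.itemId());
            case "hide" -> hiddenItems
                .computeIfAbsent(event.userId(), k -> new HashSet<>())
                .add(event.itemId());
            default -> { } // unknown action types are stored upstream but change no state here
        }
        return true;
    }
}
```

Dedup matters because Kafka consumers typically see at-least-once delivery: without it, a redelivered click would inflate category affinity twice.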

37. Profile Update Minimum

Pseudo-code:

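public void handleClick(ActionEvent event) {
    // Assumes itemId and userId were already resolved from the matching impression;
    // the raw action event in section 10 carries only impression_id.
    Item item = catalog.get(event.itemId());
    if (item == null) {
        return; // item unknown or removed from the catalog: skip the update
    }
    profileStore.incrementCategoryAffinity(event.userId(), item.categoryId(), 1.0);
    profileStore.addRecentItem(event.userId(), event.itemId());
}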

Use decay later.
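
When decay is added, one common option is exponential decay with a half-life, so that old clicks count less than recent ones. The sketch below is an assumption about how that might look, not something the skeleton prescribes; `AffinityDecay` and the half-life parameter are hypothetical.

```java
import java.time.Duration;

final class AffinityDecay {

    /** Halves the affinity for every full halfLife that has elapsed; halfLife must be positive. */
    static double decayed(double affinity, Duration elapsed, Duration halfLife) {
        double halfLives = (double) elapsed.toMillis() / halfLife.toMillis();
        return affinity * Math.pow(0.5, halfLives);
    }

    /** Applies decay for the time since the last update, then adds the new click's weight. */
    static double onClick(double affinity, Duration sinceLastUpdate, Duration halfLife, double increment) {
        return decayed(affinity, sinceLastUpdate, halfLife) + increment;
    }
}
```

With a 7-day half-life, an affinity of 0.8 that has not been touched for 7 days decays to about 0.4 before the new click is added.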

38. Decision Logging Minimum

Emit asynchronously.

If Kafka fails:

buffer if possible,
metric/alert,
do not block the response unless compliance requires it.

Decision log completeness metric:

decision_log_success_rate
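
A minimal sketch of the non-blocking path: the request thread offers the serialized decision to a bounded in-memory buffer and never waits, while a background publisher (not shown) drains the buffer into Kafka. `DecisionLogBuffer` is a hypothetical name. Note that the rate computed here only reflects acceptance into the buffer; a complete `decision_log_success_rate` would also account for publish failures on the Kafka side.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

final class DecisionLogBuffer {
    private final BlockingQueue<String> queue;
    private final AtomicLong accepted = new AtomicLong();
    private final AtomicLong dropped = new AtomicLong();

    DecisionLogBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Never blocks the request thread; returns false when the buffer is full. */
    boolean offer(String serializedDecision) {
        boolean ok = queue.offer(serializedDecision);
        if (ok) {
            accepted.incrementAndGet();
        } else {
            dropped.incrementAndGet(); // should also raise a metric or alert
        }
        return ok;
    }

    /** Called by the background publisher; returns null when nothing is buffered. */
    String poll() {
        return queue.poll();
    }

    /** Share of decisions accepted into the buffer; 1.0 before any traffic. */
    double successRate() {
        long ok = accepted.get();
        long total = ok + dropped.get();
        return total == 0 ? 1.0 : (double) ok / total;
    }
}
```

The bounded capacity is the important design choice: an unbounded buffer would hide a Kafka outage until the service runs out of memory.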

39. Privacy Minimum

Implement:

privacy mode in request,
personalized vs non-personalized path,
no profile fetch in non-personalized mode,
debug redaction,
user hide/suppression.

Tests:

non_personalized request does not use profile
hidden item excluded
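
The first test above is easiest to satisfy if the profile lookup sits behind a single guard. In this sketch `PrivacyAwareProfileLoader` is a hypothetical name, the mode strings follow the `privacy_mode` field from the request example, and any mode other than `personalized` is treated as non-personalized so that unknown values fail closed.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

final class PrivacyAwareProfileLoader {
    private final Function<String, Map<String, Double>> profileLookup;
    private int lookups;

    /** profileLookup stands in for the Redis-backed profile store; it may return null. */
    PrivacyAwareProfileLoader(Function<String, Map<String, Double>> profileLookup) {
        this.profileLookup = profileLookup;
    }

    /** Touches the profile store only when the request explicitly allows personalization. */
    Optional<Map<String, Double>> load(String userId, String privacyMode) {
        if (userId == null || !"personalized".equals(privacyMode)) {
            return Optional.empty(); // non-personalized path: no profile fetch at all
        }
        lookups++;
        return Optional.ofNullable(profileLookup.apply(userId));
    }

    /** Number of store lookups performed, exposed so tests can assert on it. */
    int lookupCount() {
        return lookups;
    }
}
```

A regression test can then assert that a `non_personalized` request leaves `lookupCount()` at zero, which checks the behavior rather than just the output.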

40. Safety Minimum

Implement:

item active/recommendable flags,
policy denylist/tombstone,
final eligibility check,
quality floor,
report/hide negative feedback.

Safety should not wait for advanced classifier.

41. Security Minimum

Implement:

API authentication or gateway assumption,
internal debug authorization,
tenant_id propagation if enterprise,
config/admin restricted,
no raw score exposure externally,
audit for debug access.

Even internal prototypes leak data if debug endpoints are left open.

42. Minimal Database Migrations

Tables:

rec_items
rec_surface_config
rec_decision_log_index
rec_event_ingestion_checkpoint
rec_fallback_items
rec_experiment_assignment (optional)

Events can go to Kafka/log storage; Postgres can index metadata for debugging.

43. Minimal Dashboard

Dashboard panels:

QPS
latency p95/p99
candidate count by source
filter rejection by reason
fallback rate
empty slate rate
top categories in slate
decision log success
impression/click volume
basic CTR

This is enough to operate first slice.

44. First Release Plan

Phase 1:

internal/staging only

Phase 2:

1% traffic with safe fallback

Phase 3:

A/B against existing baseline/editorial

Phase 4:

gradual rollout

Do not launch full traffic without observability/fallback.

45. What Not to Build Yet

Avoid initially:

full feature store platform
deep ranker
bandit optimizer
complex multi-objective utility
full tenant admin UI
custom workflow DSL
LLM autonomous recommender
real-time retraining
massive microservice split

Build hooks/interfaces for future, not full complexity.

46. Production Readiness Gate for Skeleton

Gate:

contract stable
fallback works
hidden/inactive items excluded
decision logs emitted
impressions/clicks tracked
latency within SLO
dashboard live
debug trace works
rollback exists
privacy mode works
on-call/runbook exists

If any item is missing, the skeleton is not production-ready.

47. Common Skeleton Failure Modes

47.1 Too Much ML, Too Little Platform

Cannot debug/operate.

47.2 No Event Tracking

Cannot learn.

47.3 No Fallback

Outage on dependency failure.

47.4 No Filter Reasons

Bad recs impossible to debug.

47.5 No Privacy Mode

Personalization cannot be governed.

47.6 No Config Versioning

Behavior untraceable.

47.7 No Decision Log

Training/debugging broken.

47.8 No Local Environment

Slow iteration.

47.9 Microservices Too Early

Ops overhead.

47.10 No Regression Tests

Safety bugs repeat.

48. Implementation Milestone Checklist

Milestone A — Contracts

[ ] OpenAPI recommendation endpoint
[ ] event schemas
[ ] domain objects

Milestone B — Serving Core

[ ] candidate sources
[ ] eligibility
[ ] ranker
[ ] reranker
[ ] fallback

Milestone C — Feedback Loop

[ ] decision log
[ ] impression/action events
[ ] profile update

Milestone D — Operations

[ ] metrics
[ ] traces
[ ] debug endpoint
[ ] dashboard
[ ] load test

Milestone E — Governance Basics

[ ] privacy mode
[ ] safety denylist
[ ] security for debug
[ ] config versioning

49. Minimal Production Skeleton Summary

Skeleton should deliver:

POST /v1/recommendations/home_feed

with:

multi-source candidates,
eligibility filtering,
simple transparent ranking,
greedy slate policy,
fallback,
tracking tokens,
decision logging,
impression/click tracking,
basic profile update,
privacy-aware path,
observability/debugging,
tests/deploy/rollback.

This is the foundation for all advanced RecSys work.

50. Conclusion

A minimum production skeleton is the smallest shape that still carries production-grade DNA.

Key principles:

Build production shape before fancy model.
Start with a narrow first production slice.
Preserve module boundaries even in modular monolith.
Contracts, events, and decision logs are foundational.
Candidate/ranking/reranking should be replaceable modules.
Fallback, filter reasons, and debug traces are mandatory.
Privacy/safety/security basics must exist from the start.
Observability and CI/CD are part of the product.
Do not overbuild advanced ML before feedback loop works.
Skeleton should be small, safe, operable, and extensible.

In Part 076, we will build a concrete track: Ecommerce Recommendation System, mapping all the concepts onto the e-commerce domain: home, PDP, cart, checkout, email/push, cold-start products, sellers, inventory, promotions, returns, and marketplace health.
