
title: Build From Scratch Recommendations System - Part 076
description: Designing a production-grade ecommerce recommendation system: home feed, PDP similar items, cart complements, checkout safeguards, email/push, product/seller/catalog modeling, inventory, pricing, promotions, returns, cold-start, ranking objectives, marketplace health, and implementation blueprint.
series: learn-build-from-scratch-recommendations-system
seriesTitle: Build From Scratch: Enterprise Recommendations System
order: 76
partTitle: Ecommerce Recommendation System
tags:

  • recommendation-system
  • recsys
  • ecommerce
  • marketplace
  • java
  • build-from-scratch
  • series

date: 2026-07-02

Part 076 — Ecommerce Recommendation System

E-commerce is one of the most classic and most complex domains for recommendation systems.

The system has to recommend:

  • products on the home page,
  • similar items on the PDP,
  • complements in the cart,
  • upsell/cross-sell,
  • replenishment,
  • bundles,
  • new arrivals,
  • deals/promotions,
  • email/push recommendations,
  • search/category ranking support,
  • seller/brand discovery,
  • marketplace long-tail exposure.

But e-commerce also comes with hard constraints:

  • inventory,
  • price,
  • shipping,
  • region,
  • seller trust,
  • counterfeit/fraud,
  • return/refund,
  • campaign eligibility,
  • sponsored disclosure,
  • purchased suppression,
  • privacy,
  • cold-start product/seller,
  • business margin,
  • marketplace fairness.

This part maps all the earlier concepts onto a production-grade e-commerce domain: domain model, surfaces, candidate sources, ranking objectives, eligibility, features, reranking, safety, experimentation, observability, and an implementation blueprint.


1. E-commerce RecSys Mental Model

An e-commerce RecSys is a decision system:

Given shopper, context, surface, and constraints,
select product slate that maximizes useful shopping outcomes
while respecting availability, trust, policy, and marketplace health.

Useful outcomes are more than clicks.

Outcomes:

product detail view
add to cart
purchase
repeat purchase
low return/refund
seller trust
customer satisfaction
margin/revenue
long-term retention

Clickbait product recommendations can raise CTR while lowering purchases and trust.


2. Key Surfaces

E-commerce recommendation surfaces:

| Surface | Purpose |
| --- | --- |
| Home feed | discovery/personalization |
| PDP similar items | alternatives/substitutes |
| PDP frequently bought together | complements |
| Cart recommendations | cross-sell/complements |
| Checkout recommendations | very strict, low-risk |
| Search/category side modules | discovery within intent |
| Email | scheduled reactivation |
| Push | high-confidence timely nudges |
| Order confirmation | replenishment/next-step |
| Account page | reorder/replenishment |
| Seller/brand page | seller catalog discovery |

Each surface needs its own objective and its own level of safety strictness.


3. Product Entity Model

An e-commerce item can be complex.

Entities:

product
sku
variant
offer
seller listing
brand
category
bundle
promotion
campaign
inventory location
shipping option

Recommendation unit may be:

  • product,
  • SKU,
  • offer,
  • listing,
  • bundle.

Choose carefully.

Example:

recommend product family on home,
recommend SKU/offer at checkout.

4. Product vs SKU vs Offer

Product

Conceptual item.

iPhone 15

SKU/Variant

Specific variation.

iPhone 15, 128GB, Blue

Offer/Listing

Seller-specific purchasable offer.

Seller A, price X, stock Y

For recommendation, ranking may operate at the product level, but final eligibility must be validated at the offer level, because only an offer has a concrete seller, price, and stock.


5. Catalog Attributes

Core attributes:

product_id
sku_id
offer_id
seller_id
brand
category
title
description
price
currency
discount
rating
review_count
inventory
shipping_region
delivery_eta
return_policy
condition
image_quality
policy_state
created_at
updated_at

Feature quality depends on catalog quality.


6. Eligibility Rules

Hard filters:

active product
recommendable policy state
in stock
available in region
shippable to user
seller active
not counterfeit/fraud flagged
not blocked brand/seller
not already purchased if durable
campaign active if promoted
age/region restriction if applicable
price visible

Eligibility should run both at candidate generation and again at final validation.

Inventory/price can change quickly.
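The hard filters above can be sketched as a single predicate that is reused at candidate time and at final validation. This is a minimal sketch covering a subset of the rules; the `Offer` record, its field names, and the `purchasedDurables` set are hypothetical and not taken from a real catalog schema.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical offer snapshot; field names are illustrative only.
record Offer(String offerId, String productId, boolean active, boolean recommendable,
             int stock, Set<String> shipRegions, boolean sellerActive,
             boolean fraudFlagged, boolean priceVisible) {}

final class EligibilityFilter {
    // Builds the hard-filter predicate for one request.
    // purchasedDurables: product ids of durable items this user already bought (assumed input).
    static Predicate<Offer> hardFilter(String userRegion, Set<String> purchasedDurables) {
        return o -> o.active()
                && o.recommendable()
                && o.stock() > 0
                && o.shipRegions().contains(userRegion)
                && o.sellerActive()
                && !o.fraudFlagged()
                && o.priceVisible()
                && !purchasedDurables.contains(o.productId());
    }

    // Keeps only offers that pass the predicate; call once on candidates
    // and once more on the final slate with a fresh offer snapshot.
    static List<Offer> apply(List<Offer> offers, Predicate<Offer> filter) {
        return offers.stream().filter(filter).toList();
    }
}
```

Running the same predicate twice is deliberate: the second pass should read fresh inventory and price, so an offer that sold out between retrieval and serving is dropped.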


7. Surface-Specific Eligibility

Home feed:

active, recommendable, region available, likely in stock

PDP similar:

same/related category, purchasable, not same exact SKU

Cart:

compatible with cart, available, not duplicate

Checkout:

strict inventory/price/shipping final check

Push/email:

high confidence, available at send/open time if possible

The closer to purchase, the stricter.


8. Feedback Events

E-commerce events:

impression
product_view
image_view
wishlist/save
add_to_cart
remove_from_cart
checkout_start
purchase
return
refund
review
rating
seller_follow
hide/not_interested
block_seller
price_alert
search
category_browse

Label semantics matter.

A click without purchase may be weak.
A purchase followed by return may be negative long-term.


9. Label Hierarchy

Example relevance grades:

hide/report: negative
impression no action: 0
product view: 1
wishlist/save: 2
add to cart: 3
purchase: 5
purchase kept/no return: 7
return/refund: subtract

Do not optimize only product views.

Cart and purchase signals are more valuable but delayed/sparser.
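One way to turn the grades above into a training label is a small mapping function. This is a sketch under assumptions: the event names follow the list in section 8, while `NEGATIVE`, `RETURN_PENALTY`, and the `returnWindowClosed` flag are hypothetical choices, since the text only says "negative" and "subtract".

```java
import java.util.Set;

final class RelevanceLabel {
    // Assumed values; the grade table does not fix them.
    static final int NEGATIVE = -1;
    static final int RETURN_PENALTY = 4;

    // Maps the events observed after one impression to a graded label.
    // returnWindowClosed: true once the return attribution window has passed.
    static int grade(Set<String> events, boolean returnWindowClosed) {
        if (events.contains("hide") || events.contains("report")) return NEGATIVE;
        boolean returned = events.contains("return") || events.contains("refund");
        if (events.contains("purchase")) {
            if (returned) return 5 - RETURN_PENALTY;   // purchase, then subtract for the return
            return returnWindowClosed ? 7 : 5;          // kept purchase vs. still-open window
        }
        if (events.contains("add_to_cart")) return 3;
        if (events.contains("wishlist")) return 2;
        if (events.contains("product_view")) return 1;
        return 0; // impression with no action
    }
}
```

The strongest observed signal wins, so a session with both a product view and an add-to-cart is graded 3 rather than 1.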


10. Attribution Windows

Example:

product_view: 30m
add_to_cart: 24h
purchase: 7d
return: 30d
repeat_purchase: 90d

Purchase attribution must handle:

  • multiple impressions,
  • search vs recommendation,
  • email vs home,
  • cart additions,
  • delayed purchase.

Define clearly.
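A minimal sketch of a per-outcome window check, using the example windows above. The class and method names are hypothetical, and the sketch covers only the time-window rule; choosing between competing impressions or surfaces needs the attribution model discussed later.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;

final class AttributionWindows {
    // Windows copied from the example above.
    static final Map<String, Duration> WINDOWS = Map.of(
        "product_view", Duration.ofMinutes(30),
        "add_to_cart", Duration.ofHours(24),
        "purchase", Duration.ofDays(7),
        "return", Duration.ofDays(30),
        "repeat_purchase", Duration.ofDays(90));

    // True if the outcome happened at or after the impression and within its window.
    // Unknown outcome types are never attributed.
    static boolean attributable(String outcome, Instant impressionAt, Instant outcomeAt) {
        Duration window = WINDOWS.get(outcome);
        if (window == null || outcomeAt.isBefore(impressionAt)) return false;
        return Duration.between(impressionAt, outcomeAt).compareTo(window) <= 0;
    }
}
```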


11. Candidate Sources by Surface

Home

  • user two-tower,
  • category affinity,
  • trending by region/category,
  • new arrivals,
  • deals,
  • similar to recent views,
  • seller/brand affinity,
  • editorial.

PDP Similar

  • item-to-item co-view,
  • same category/brand,
  • embedding similarity,
  • substitute model.

PDP Complements

  • co-buy,
  • cart complement,
  • bundle association,
  • accessories.

Cart

  • complements,
  • frequently bought together,
  • replenishment,
  • threshold/free shipping suggestions.

Email/Push

  • precomputed personalized,
  • replenishment,
  • price drop,
  • back-in-stock,
  • wishlist reminder.

12. Home Feed Candidate Mix

Example:

candidate_policy: ecommerce_home_v1
sources:
  user_two_tower:
    quota: 500
  recent_category_affinity:
    quota: 300
  trending_region:
    quota: 200
  deals:
    quota: 100
  new_arrivals:
    quota: 100
  editorial:
    quota: 50

Monitor how much each source actually contributes to the merged candidate set.
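A sketch of quota-based mixing that keeps provenance so contribution can be counted. The `Candidate` record and `CandidateMixer` class are hypothetical; the rule that the first source to propose a product keeps the credit is an assumption, not something the policy above specifies.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Candidate tagged with the source that produced it (provenance).
record Candidate(String productId, String source) {}

final class CandidateMixer {
    // Takes up to `quota` candidates per source, in the map's iteration order.
    // A product proposed by several sources is credited to the first one.
    static List<Candidate> mix(Map<String, List<String>> bySource, Map<String, Integer> quotas) {
        Map<String, Candidate> merged = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : bySource.entrySet()) {
            int quota = quotas.getOrDefault(e.getKey(), 0);
            e.getValue().stream().limit(quota)
                .forEach(id -> merged.putIfAbsent(id, new Candidate(id, e.getKey())));
        }
        return new ArrayList<>(merged.values());
    }

    // Counts how many merged candidates each source contributed.
    static Map<String, Integer> contribution(List<Candidate> merged) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Candidate c : merged) counts.merge(c.source(), 1, Integer::sum);
        return counts;
    }
}
```

Emitting these counts per request makes it visible when one source silently stops returning candidates or another crowds out the rest.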


13. PDP Similar Items

Goal:

help shopper compare alternatives

Candidate sources:

  • same category,
  • similar price,
  • same brand,
  • different seller,
  • embedding similarity,
  • co-view,
  • co-click from PDP sessions.

Avoid:

  • exact same product duplicate,
  • out-of-stock,
  • irrelevant accessories if surface says similar.

Metrics:

  • click to PDP,
  • add to cart,
  • purchase,
  • comparison satisfaction,
  • bounce reduction.

14. Substitutes vs Complements

Substitute:

alternative to current product

Complement:

used with current product

Example:

  • camera body substitute: another camera body.
  • complement: memory card, lens, bag.

Do not mix substitutes and complements in one module unless the UI explains the difference.

Separate candidate sources and labels.


15. Cart Complements

Goal:

increase basket usefulness and value

Examples:

  • phone case for phone,
  • memory card for camera,
  • batteries for device,
  • refill for consumable,
  • compatible accessory.

Hard constraints:

  • compatibility,
  • inventory,
  • price relevance,
  • no duplicate already in cart,
  • not annoying.

Cart recommendations are sensitive because the shopper is close to purchase.


16. Checkout Recommendations

Use very carefully.

Rules:

  • low distraction,
  • highly compatible,
  • available,
  • low return risk,
  • no policy ambiguity,
  • no slow dependency,
  • no high-risk sponsored clutter.

Sometimes no recommendation is better than bad checkout distraction.

Guardrails:

  • checkout completion,
  • cart abandonment,
  • latency,
  • return/refund.

17. Replenishment

For consumables:

estimate repurchase interval
recommend when likely needed

Features:

  • purchase history,
  • product consumption cycle,
  • quantity,
  • household size if known/allowed,
  • seasonality.

Examples:

  • coffee,
  • skincare,
  • pet food,
  • printer ink.

Avoid recommending durable products repeatedly.
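A simple way to estimate the repurchase interval from purchase history alone is the median gap between consecutive purchases. This is a sketch, assuming purchase dates arrive sorted ascending; it ignores quantity, household size, and seasonality, and the class name is hypothetical.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.OptionalLong;

final class Replenishment {
    // Median gap in days between consecutive purchases (upper median for an even count).
    // Empty when there are fewer than two purchases.
    static OptionalLong medianIntervalDays(List<LocalDate> purchases) {
        if (purchases.size() < 2) return OptionalLong.empty();
        List<Long> gaps = new ArrayList<>();
        for (int i = 1; i < purchases.size(); i++) {
            gaps.add(ChronoUnit.DAYS.between(purchases.get(i - 1), purchases.get(i)));
        }
        Collections.sort(gaps);
        return OptionalLong.of(gaps.get(gaps.size() / 2));
    }

    // Due when the time since the last purchase has reached the estimated interval.
    static boolean isDue(List<LocalDate> purchases, LocalDate today) {
        OptionalLong interval = medianIntervalDays(purchases);
        if (interval.isEmpty()) return false;
        LocalDate last = purchases.get(purchases.size() - 1);
        return ChronoUnit.DAYS.between(last, today) >= interval.getAsLong();
    }
}
```

The median is used instead of the mean so that one unusually long gap, such as a holiday, does not stretch the estimate.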


18. Purchased Suppression

Purchased item logic:

Durable

Suppress same product for long window.

laptop, camera, refrigerator

Consumable

Replenish after interval.

coffee, detergent, pet food

Collectible/Fashion

May recommend similar/variant but avoid exact duplicate.

Domain semantics matter.
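The three cases above can be sketched as one decision function. The enum, the class name, and the 365-day `DURABLE_WINDOW` are hypothetical; the text only says "long window", so the actual value should come from category data.

```java
import java.time.Duration;

enum ProductType { DURABLE, CONSUMABLE, FASHION }

final class PurchasedSuppression {
    // Assumed "long window" for durables; tune per category.
    static final Duration DURABLE_WINDOW = Duration.ofDays(365);

    // Decides whether the exact purchased product should be hidden from recommendations.
    static boolean suppressSameProduct(ProductType type, Duration sincePurchase,
                                       Duration replenishInterval) {
        switch (type) {
            case DURABLE:
                return sincePurchase.compareTo(DURABLE_WINDOW) < 0;
            case CONSUMABLE:
                // Hidden until the replenishment interval has elapsed.
                return sincePurchase.compareTo(replenishInterval) < 0;
            case FASHION:
                // Exact duplicate stays hidden; similar items and variants are handled elsewhere.
                return true;
            default:
                throw new IllegalArgumentException("unknown type " + type);
        }
    }
}
```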


19. Inventory Freshness

Inventory can change quickly.

Strategies:

  • nearline inventory events,
  • final availability check,
  • stock confidence feature,
  • overfetch candidates,
  • fallback deeper list,
  • exclude low-stock for email/push if risk high.

Never send a push or email for an item that is likely to be unavailable by the time the user opens it.


20. Pricing and Promotion

Pricing features:

current price
discount percentage
price drop
relative price bucket
margin
promotion active
coupon eligible
free shipping threshold

But business boosts must be controlled.

Do not let discount/sponsored boost override relevance/safety.


21. Sponsored Products

Sponsored recommendations need:

  • disclosure,
  • campaign active,
  • relevance floor,
  • policy eligibility,
  • budget pacing,
  • frequency cap,
  • seller trust,
  • auction score if ads system,
  • organic/sponsored blending.

RecSys should not hide sponsored nature.


22. Marketplace Seller Health

Recommendation affects sellers.

Monitor:

seller exposure concentration
new seller time to first exposure
qualified seller exposure
seller trust-weighted exposure
counterfeit/fraud exposure
long-tail revenue
seller churn

Do not blindly concentrate all exposure to top sellers if marketplace health matters.


23. Seller Trust and Fraud

Features:

seller_rating
seller_age
fulfillment_success_rate
refund_rate
complaint_rate
counterfeit_risk
policy_violation_count
review_authenticity
shipping_delay_rate

High-risk sellers should be filtered/downranked.

Do not use raw sales without trust adjustment.


24. Return/Refund-Aware Ranking

A return or refund is a negative long-term signal.

Model tasks:

p_purchase
p_return
p_refund
expected_margin
expected_satisfaction

Utility:

expected_value =
  purchase_value
  - return_cost
  - refund_risk
  - complaint_risk

This prevents pushing products that sell but disappoint.


25. Review and Rating Features

Use:

  • rating average,
  • review count,
  • verified purchase ratio,
  • recent review trend,
  • review authenticity,
  • rating by segment.

Smoothing is important.

A 5-star product with 2 reviews should not blindly outrank a 4.6-star product with 10,000 reviews.
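One common smoothing choice is a Bayesian average, which pulls low-volume ratings toward a prior mean. This is a sketch, not necessarily the method this series uses; `priorMean` and `priorWeight` are assumed tuning parameters, typically set from the category's average rating and a pseudo-count of reviews.

```java
final class RatingSmoothing {
    // Bayesian average: (priorWeight * priorMean + sum of ratings) / (priorWeight + reviewCount).
    // With few reviews the result stays near priorMean; with many it approaches avgRating.
    static double smoothed(double avgRating, long reviewCount,
                           double priorMean, double priorWeight) {
        return (priorWeight * priorMean + avgRating * reviewCount)
                / (priorWeight + reviewCount);
    }
}
```

With a prior mean of 4.0 and a prior weight of 50, the 5-star product with 2 reviews smooths to about 4.04, while the 4.6-star product with 10,000 reviews stays at about 4.60, so the well-reviewed product ranks higher.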


26. Content and Image Features

E-commerce products need multimodal features:

  • title embedding,
  • description embedding,
  • image embedding,
  • category path,
  • brand,
  • attributes,
  • style/color,
  • compatibility tags.

Cold-start products rely heavily on content features.

Metadata extraction quality matters.


27. Product Embeddings

Embedding families:

product_content_embedding
two_tower_item_embedding
image_style_embedding
co_buy_graph_embedding

Use cases:

  • similar products,
  • cold-start retrieval,
  • semantic search,
  • style matching,
  • cross-category complements.

Keep versions and index metadata.


28. E-commerce Feature Set

Feature groups:

User

category affinity
brand affinity
price preference
discount sensitivity
seller affinity
purchase frequency
return tendency

Item

quality score
CTR/CVR
rating/reviews
price bucket
discount
inventory confidence
return risk
seller trust

Cross

user_category_match
user_brand_match
price_fit
seen_count
similar_to_recent_item
cart_compatibility

Context

surface
device
region
season
query/cart context

29. Ranking Objective

The e-commerce objective is often multi-task:

p_view
p_add_to_cart
p_purchase
p_return
p_hide
p_report
expected_margin
seller_trust

Utility example:

score =
  1.0 * p_view
  + 3.0 * p_add_to_cart
  + 8.0 * p_purchase * expected_margin
  - 5.0 * p_return
  - 20.0 * p_report
  + freshness_bonus

Weights are business/product decisions.
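The utility example can be written directly as a scoring function. This is a sketch using the example weights as-is; the `TaskScores` record is hypothetical, and `expectedMargin` is assumed to be normalized so that it is comparable across products.

```java
// Per-item model outputs; names mirror the terms of the utility example.
record TaskScores(double pView, double pAddToCart, double pPurchase, double expectedMargin,
                  double pReturn, double pReport, double freshnessBonus) {}

final class UtilityScorer {
    // Weighted sum of positive outcomes minus weighted negative outcomes.
    static double score(TaskScores s) {
        return 1.0 * s.pView()
             + 3.0 * s.pAddToCart()
             + 8.0 * s.pPurchase() * s.expectedMargin()
             - 5.0 * s.pReturn()
             - 20.0 * s.pReport()
             + s.freshnessBonus();
    }
}
```

With illustrative inputs, a clickbait item (high `pView` of 0.30, but `pReturn` 0.05 and `pReport` 0.01) scores about -0.01, while a solid item (`pView` 0.10, `pPurchase` 0.03, low return and report risk) scores about 0.445, which is the behavior the negative terms are there to produce.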


30. Revenue vs Trust

Optimizing revenue alone can hurt trust.

Guardrails:

  • return rate,
  • complaint rate,
  • hide/report,
  • seller fraud,
  • low rating exposure,
  • repeat purchase,
  • retention.

A marketplace that lasts requires trust.


31. Reranking for E-commerce

Reranking constraints:

max same seller
max same brand
max same category
sponsored cap
deal cap
new item exploration slots
seller diversity
price diversity
inventory confidence floor
no duplicate variants

These constraints are surface-specific.

Home feed wants diversity. PDP similar may prefer narrower category.
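As an illustration of one constraint from the list, "max same seller" can be enforced with a greedy pass over the ranked list. This is a sketch; the `Ranked` record and class name are hypothetical, and a production reranker would apply several constraints together.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

record Ranked(String productId, String sellerId, double score) {}

final class SellerCapReranker {
    // Walks items in score order and skips any item whose seller
    // already fills maxPerSeller slots, until the slate is full.
    static List<Ranked> rerank(List<Ranked> sortedByScore, int slateSize, int maxPerSeller) {
        Map<String, Integer> perSeller = new HashMap<>();
        List<Ranked> slate = new ArrayList<>();
        for (Ranked r : sortedByScore) {
            if (slate.size() == slateSize) break;
            int used = perSeller.getOrDefault(r.sellerId(), 0);
            if (used >= maxPerSeller) continue;
            perSeller.put(r.sellerId(), used + 1);
            slate.add(r);
        }
        return slate;
    }
}
```

The same pattern extends to brand, category, sponsored, and deal caps by keeping one counter per constraint.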


32. Variant Deduplication

Avoid showing:

same shirt in 10 colors
same phone storage variants
same seller duplicates

Use product family/dedup group.

Reranker picks representative based on:

  • availability,
  • price,
  • rating,
  • user preference,
  • image quality.
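Picking one representative per dedup group can be sketched as below. The `Variant` record is hypothetical, and the preference order (in stock first, then higher rating) is an assumption covering only two of the criteria listed above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

record Variant(String skuId, String dedupGroup, boolean inStock, double rating) {}

final class VariantDedup {
    // True if a is a better representative than b: in stock wins, then higher rating.
    static boolean better(Variant a, Variant b) {
        if (a.inStock() != b.inStock()) return a.inStock();
        return a.rating() > b.rating();
    }

    // Keeps one variant per dedup group; groups stay in first-seen (ranking) order.
    static List<Variant> representatives(List<Variant> ranked) {
        Map<String, Variant> best = new LinkedHashMap<>();
        for (Variant v : ranked) {
            best.merge(v.dedupGroup(), v, (cur, next) -> better(next, cur) ? next : cur);
        }
        return new ArrayList<>(best.values());
    }
}
```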

33. Price Diversity

Users may have a price preference.

Slate can include:

  • within preferred range,
  • some cheaper alternatives,
  • premium option if relevant.

Do not recommend only expensive items if the user historically buys budget items.

Price mismatch hurts conversion/trust.


34. New Product Cold-Start

Strategies:

  • content embeddings,
  • category priors,
  • seller trust prior,
  • brand prior,
  • exploration budget,
  • editorial/new-arrival source,
  • small exposure ramp,
  • quality/policy floor.

Metrics:

time_to_first_impression
time_to_first_click
new_product_cvr
new_product_report_rate

35. New Seller Cold-Start

Strategies:

  • trust verification,
  • limited exposure ramp,
  • category/region match,
  • quality floor,
  • fulfillment monitoring,
  • fraud checks,
  • seller onboarding metadata quality.

Do not give unlimited exposure to an untrusted seller.


36. Regional and Logistics Constraints

E-commerce is local.

Constraints:

ship to region
delivery ETA
shipping cost
warehouse availability
regulatory restrictions
currency
tax
returns availability

Recommending an item that cannot be shipped to the user's region is a bad experience.

Include logistics in eligibility/features.


37. Seasonality

Examples:

  • Ramadan/Eid,
  • Christmas,
  • back-to-school,
  • payday,
  • weather,
  • sports events.

Features:

seasonal category trend
regional calendar
promotion calendar
historical seasonality

Be careful not to overfit one event.


38. Email Recommendations

Email types:

daily deals
wishlist price drop
cart abandonment
back in stock
replenishment
new arrivals in followed category
seller/brand updates

Requirements:

  • consent,
  • unsubscribe,
  • frequency cap,
  • fresh inventory,
  • send-time validation,
  • tracking,
  • attribution,
  • no sensitive/policy risky content.

39. Push Recommendations

Push should be rare and high confidence.

Good triggers:

price drop on wished item
back in stock
order-related complement
replenishment due
limited-time relevant deal

Guardrails:

  • notification opt-in,
  • quiet hours,
  • fatigue,
  • unsubscribe,
  • hide/report,
  • conversion,
  • app uninstall risk.

40. Search and Category Interaction

Search/category ranking is often a separate system, but RecSys can support it with:

  • personalized category boosts,
  • similar products,
  • query recommendations,
  • related searches,
  • product modules.

Search is intent-driven. Personalization should not override query intent too aggressively.


41. E-commerce Event Schema Additions

Purchase event:

{
  "event_id": "purchase_001",
  "user_id": "u123",
  "order_id": "order_789",
  "items": [
    {
      "product_id": "p1",
      "sku_id": "sku1",
      "offer_id": "offer1",
      "seller_id": "seller1",
      "price": 1200000,
      "currency": "IDR",
      "quantity": 1
    }
  ],
  "event_time": "2026-07-02T10:30:00Z"
}

Return/refund events should link to order/item.


42. Attribution Challenges

Attribution sources:

  • search,
  • recommendation home,
  • PDP similar,
  • email,
  • ads/sponsored,
  • cart module.

You need an explicit attribution model:

last-touch
multi-touch
surface-specific
incrementality experiment

Avoid over-crediting recommendations for purchases the user would have made anyway.


43. Offline Evaluation Metrics

Retrieval:

purchase recall@K
cart recall@K
similar item recall
complement recall

Ranking:

NDCG@K for add_to_cart/purchase
AUC/logloss for purchase/return
calibration

Slate:

seller/category diversity
duplicate variant rate
new item exposure
sponsored cap

Guardrails:

return/refund
hide/report
out-of-stock click rate

44. Online Experiments

Experiment primary metrics by surface:

Home

purchase per user
add-to-cart rate
CTR
retention

PDP Similar

similar module CTR
add-to-cart
purchase assist
bounce reduction

Cart

AOV
checkout completion
cart abandonment
return rate

Email/Push

incremental purchase
unsubscribe
notification disable

Always include guardrails.


45. Observability

E-commerce dashboards:

candidate source contribution
out-of-stock rejection
seller exposure
category exposure
return/refund by model
sponsored exposure
inventory freshness
cold-start product exposure
price bucket distribution
fallback rate
email/push fatigue

Break each dashboard down by region, category, and seller.


46. Safety and Policy

Policy concerns:

restricted products
counterfeit
fraud sellers
misleading listings
dangerous products
age/region restrictions
sponsored disclosures
review manipulation

Final validation is mandatory for high-risk products.

The trust and safety team should own the policy taxonomy.


47. Privacy

E-commerce data can reveal sensitive interests.

Controls:

  • consent-aware personalization,
  • non-personalized path,
  • hide/reset recommendations,
  • retention,
  • debug redaction,
  • sensitive category handling,
  • no sensitive reason explanations.

Example risky explanation:

Recommended because you often buy medical products.

Use safe reason codes.


48. Implementation Blueprint

Services/modules:

rec-api
catalog-adapter
candidate-service
eligibility-service
ranking-service
slate-service
profile-store
feature-store
event-ingestion
batch-scoring
embedding-index-pipeline
experiment-integration
observability

First production slice:

home_feed + PDP_similar

Then add:

cart_complements
email_replenishment
push_price_drop
marketplace_health_reranking

49. E-commerce Minimum Feature/Roadmap Plan

Phase 1

popular/trending
profile category candidates
PDP similar by category
heuristic ranker
inventory/policy filter
decision/impression/click events

Phase 2

co-view/co-buy item-to-item
purchase/add-to-cart labels
GBDT ranker
seller trust features
variant dedup
email recommendations

Phase 3

two-tower retrieval
content/image embeddings
multi-task ranker purchase/return
batch scoring
cold-start exploration
marketplace exposure health

Phase 4

bandits
advanced causal incrementality
personalized promotion strategy
LLM-assisted catalog enrichment/explanation

50. Common E-commerce RecSys Failure Modes

50.1 Recommending Out-of-Stock Items

Inventory/final check failure.

50.2 Recommending Already Purchased Durable

Suppression semantics missing.

50.3 Same Variant Spam

Dedup group missing.

50.4 CTR Optimization Increases Returns

Wrong objective.

50.5 Sponsored Overrides Relevance

Trust loss.

Raw engagement abuse.

50.7 New Products Never Exposed

Cold-start failure.

50.8 Low-Trust Seller Gets Exposure

Trust feature missing.

50.9 Email Sends Stale Deals

Batch validation missing.

Availability mismatch.


51. E-commerce RecSys Readiness Checklist

[ ] Product/SKU/offer recommendation unit is defined.
[ ] Catalog eligibility includes inventory/region/policy/seller.
[ ] Surface-specific objectives are defined.
[ ] PDP similar and cart complement are separated.
[ ] Purchased suppression semantics exist by product type.
[ ] Candidate sources preserve provenance.
[ ] Ranking objective includes purchase and negative outcomes, not only click.
[ ] Return/refund risk is monitored.
[ ] Seller trust/fraud signals are included.
[ ] Variant/dedup group reranking exists.
[ ] Sponsored rules/disclosures are enforced.
[ ] Cold-start product/seller strategy exists.
[ ] Email/push consent and freshness checks exist.
[ ] Inventory final validation exists.
[ ] Marketplace exposure metrics exist.
[ ] Privacy-safe reason codes exist.
[ ] Out-of-stock/stale/precomputed rejection metrics exist.

52. Conclusion

E-commerce RecSys is a very good domain for building end-to-end capability because it forces us to combine relevance, conversion, inventory, pricing, seller trust, policy, returns, marketplace health, and personalization.

Key principles:

  1. E-commerce recommendation optimizes useful shopping outcomes, not just clicks.
  2. Product/SKU/offer distinction matters.
  3. Eligibility must include inventory, region, policy, seller, and campaign state.
  4. Surface objectives differ: home, PDP, cart, checkout, email, push.
  5. Similar items and complements are different problems.
  6. Purchased suppression must understand durable vs consumable products.
  7. Ranking should account for purchase, return/refund, margin, trust, and satisfaction.
  8. Sponsored/business boosts must not override relevance/safety.
  9. Cold-start product/seller needs controlled exposure and trust floors.
  10. Marketplace health and long-term trust are first-class metrics.

In Part 077, we will cover the Content Feed Recommendation System: the feed/news/video/social/learning content domain, which brings different challenges such as freshness, session intent, creator ecosystem, safety, diversity, dwell, fatigue, and long-term satisfaction.
