title: Build From Scratch Recommendations System - Part 076
description: "Designing a production-grade ecommerce recommendation system: home feed, PDP similar items, cart complements, checkout safeguards, email/push, product/seller/catalog modeling, inventory, pricing, promotions, returns, cold-start, ranking objectives, marketplace health, and implementation blueprint."
series: learn-build-from-scratch-recommendations-system
seriesTitle: "Build From Scratch: Enterprise Recommendations System"
order: 76
partTitle: Ecommerce Recommendation System
tags:
- recommendation-system
- recsys
- ecommerce
- marketplace
- java
- build-from-scratch
- series
date: 2026-07-02
Part 076 — Ecommerce Recommendation System
E-commerce is one of the most classic and most complex domains for recommendation systems.
The system has to recommend:
- produk di home,
- similar items di PDP,
- complements di cart,
- upsell/cross-sell,
- replenishment,
- bundles,
- new arrivals,
- deals/promotions,
- email/push recommendations,
- search/category ranking support,
- seller/brand discovery,
- marketplace long-tail exposure.
But e-commerce also has hard constraints:
- inventory,
- price,
- shipping,
- region,
- seller trust,
- counterfeit/fraud,
- return/refund,
- campaign eligibility,
- sponsored disclosure,
- purchased suppression,
- privacy,
- cold-start product/seller,
- business margin,
- marketplace fairness.
This part maps all the earlier concepts onto a production-grade e-commerce domain: domain model, surfaces, candidate sources, ranking objectives, eligibility, features, reranking, safety, experimentation, observability, and implementation blueprint.
1. E-commerce RecSys Mental Model
An e-commerce RecSys is a decision system:
Given shopper, context, surface, and constraints,
select product slate that maximizes useful shopping outcomes
while respecting availability, trust, policy, and marketplace health.
Useful outcomes are more than clicks.
Outcomes:
product detail view
add to cart
purchase
repeat purchase
low return/refund
seller trust
customer satisfaction
margin/revenue
long-term retention
Clickbait product recommendations can raise CTR while lowering purchases and trust.
2. Key Surfaces
E-commerce recommendation surfaces:
| Surface | Purpose |
|---|---|
| Home feed | discovery/personalization |
| PDP similar items | alternatives/substitutes |
| PDP frequently bought together | complements |
| Cart recommendations | cross-sell/complements |
| Checkout recommendations | very strict, low-risk |
| Search/category side modules | discovery within intent |
| Email | scheduled reactivation |
| Push | high-confidence timely nudges |
| Order confirmation | replenishment/next-step |
| Account page | reorder/replenishment |
| Seller/brand page | seller catalog discovery |
Each surface needs its own objective and its own level of safety strictness.
3. Product Entity Model
An e-commerce item can be complex.
Entities:
product
sku
variant
offer
seller listing
brand
category
bundle
promotion
campaign
inventory location
shipping option
Recommendation unit may be:
- product,
- SKU,
- offer,
- listing,
- bundle.
Choose carefully.
Example:
recommend product family on home,
recommend SKU/offer at checkout.
4. Product vs SKU vs Offer
Product
Conceptual item.
iPhone 15
SKU/Variant
Specific variation.
iPhone 15, 128GB, Blue
Offer/Listing
Seller-specific purchasable offer.
Seller A, price X, stock Y
For recommendation, ranking may operate at product level, but final eligibility must be validated at offer level, because price and stock belong to the offer.
5. Catalog Attributes
Core attributes:
product_id
sku_id
offer_id
seller_id
brand
category
title
description
price
currency
discount
rating
review_count
inventory
shipping_region
delivery_eta
return_policy
condition
image_quality
policy_state
created_at
updated_at
Feature quality depends on catalog quality.
6. Eligibility Rules
Hard filters:
active product
recommendable policy state
in stock
available in region
shippable to user
seller active
not counterfeit/fraud flagged
not blocked brand/seller
not already purchased if durable
campaign active if promoted
age/region restriction if applicable
price visible
Eligibility should run twice: once at candidate generation and again at final validation.
Inventory and price can change quickly between the two.
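A minimal sketch of such a hard filter, in Python. The `Offer` fields, the reason strings, and the helper names are hypothetical and chosen for illustration; only a subset of the rules listed above is covered. The same function can be called once on the candidate snapshot and again on a fresh offer snapshot just before the slate is returned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    # Hypothetical offer-level snapshot; field names are illustrative only.
    offer_id: str
    active: bool
    policy_state: str          # e.g. "recommendable" or "blocked"
    stock: int
    ship_regions: frozenset
    seller_active: bool
    fraud_flagged: bool
    price_visible: bool


def rejection_reasons(offer, user_region):
    """Return every hard-filter reason that rejects the offer (empty list = eligible)."""
    checks = [
        (offer.active, "inactive_product"),
        (offer.policy_state == "recommendable", "policy_state"),
        (offer.stock > 0, "out_of_stock"),
        (user_region in offer.ship_regions, "region_unavailable"),
        (offer.seller_active, "seller_inactive"),
        (not offer.fraud_flagged, "fraud_flagged"),
        (offer.price_visible, "price_hidden"),
    ]
    return [reason for passed, reason in checks if not passed]


def is_eligible(offer, user_region):
    return not rejection_reasons(offer, user_region)
```

Returning the reasons instead of a bare boolean makes it cheap to feed rejection counts into dashboards such as out-of-stock rejection rate.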
7. Surface-Specific Eligibility
Home feed:
active, recommendable, region available, likely in stock
PDP similar:
same/related category, purchasable, not same exact SKU
Cart:
compatible with cart, available, not duplicate
Checkout:
strict inventory/price/shipping final check
Push/email:
high confidence, available at send/open time if possible
The closer to purchase, the stricter.
8. Feedback Events
E-commerce events:
impression
product_view
image_view
wishlist/save
add_to_cart
remove_from_cart
checkout_start
purchase
return
refund
review
rating
seller_follow
hide/not_interested
block_seller
price_alert
search
category_browse
Label semantics matter.
A click without purchase may be weak.
A purchase followed by return may be negative long-term.
9. Label Hierarchy
Example relevance grades:
hide/report: negative
impression no action: 0
product view: 1
wishlist/save: 2
add to cart: 3
purchase: 5
purchase kept/no return: 7
return/refund: subtract
Do not optimize only product views.
Cart and purchase signals are more valuable but delayed/sparser.
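The grading above could be computed per (user, item) pair roughly as follows. This is a sketch under assumptions: the source only says "negative" for hide/report and "subtract" for return/refund, so the values `NEGATIVE_GRADE` and `RETURN_PENALTY` are placeholders, and `purchase_kept` is a hypothetical event emitted by an upstream job once the return window has closed.

```python
# Grades taken from the example hierarchy; event names are illustrative.
GRADES = {
    "impression": 0,
    "product_view": 1,
    "wishlist": 2,
    "add_to_cart": 3,
    "purchase": 5,
    "purchase_kept": 7,
}
NEGATIVE_GRADE = -1   # assumption: the source only says "negative"
RETURN_PENALTY = 4    # assumption: the source only says "subtract"


def relevance_grade(events):
    """Map the set of events observed for one impression to a single label."""
    events = set(events)
    if events & {"hide", "report"}:
        return NEGATIVE_GRADE
    grade = max((GRADES.get(e, 0) for e in events), default=0)
    if events & {"return", "refund"}:
        grade -= RETURN_PENALTY
    return grade
```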
10. Attribution Windows
Example:
product_view: 30m
add_to_cart: 24h
purchase: 7d
return: 30d
repeat_purchase: 90d
Purchase attribution must handle:
- multiple impressions,
- search vs recommendation,
- email vs home,
- cart additions,
- delayed purchase.
Define clearly.
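One way to make the windows explicit is a small lookup plus a check. The sketch below measures every window from the impression time, which is a simplifying assumption (a return window, for instance, might instead be measured from the purchase). The function name is hypothetical.

```python
from datetime import datetime, timedelta

# Windows copied from the example above.
ATTRIBUTION_WINDOWS = {
    "product_view": timedelta(minutes=30),
    "add_to_cart": timedelta(hours=24),
    "purchase": timedelta(days=7),
    "return": timedelta(days=30),
    "repeat_purchase": timedelta(days=90),
}


def is_attributable(event_type, impression_time, event_time):
    """True if the event falls inside the window that starts at the impression."""
    window = ATTRIBUTION_WINDOWS.get(event_type)
    if window is None:
        return False  # unknown event types are never attributed
    elapsed = event_time - impression_time
    return timedelta(0) <= elapsed <= window
```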
11. Candidate Sources by Surface
Home
- user two-tower,
- category affinity,
- trending by region/category,
- new arrivals,
- deals,
- similar to recent views,
- seller/brand affinity,
- editorial.
PDP Similar
- item-to-item co-view,
- same category/brand,
- embedding similarity,
- substitute model.
PDP Complements
- co-buy,
- cart complement,
- bundle association,
- accessories.
Cart
- complements,
- frequently bought together,
- replenishment,
- threshold/free shipping suggestions.
Email/Push
- precomputed personalized,
- replenishment,
- price drop,
- back-in-stock,
- wishlist reminder.
12. Home Feed Candidate Mix
Example:
candidate_policy: ecommerce_home_v1
sources:
  user_two_tower:
    quota: 500
  recent_category_affinity:
    quota: 300
  trending_region:
    quota: 200
  deals:
    quota: 100
  new_arrivals:
    quota: 100
  editorial:
    quota: 50
Use source contribution monitoring.
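A sketch of how quotas and contribution monitoring could fit together. The helper names are hypothetical; the point is that each merged candidate keeps the list of sources that produced it, so contribution can be measured later.

```python
from collections import Counter

# Quotas copied from the ecommerce_home_v1 example.
QUOTAS = {
    "user_two_tower": 500,
    "recent_category_affinity": 300,
    "trending_region": 200,
    "deals": 100,
    "new_arrivals": 100,
    "editorial": 50,
}


def merge_candidates(by_source, quotas):
    """Return item_id -> list of sources, taking at most `quota` items per source."""
    merged = {}
    for source, quota in quotas.items():
        for item_id in by_source.get(source, [])[:quota]:
            merged.setdefault(item_id, []).append(source)
    return merged


def source_contribution(merged):
    """Share of unique candidates each source contributed to (overlaps count for both)."""
    total = len(merged)
    if total == 0:
        return {}
    counts = Counter(src for sources in merged.values() for src in sources)
    return {src: n / total for src, n in counts.items()}
```

Because one item can come from several sources, the shares can sum to more than 1; that overlap is itself a useful signal of source redundancy.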
13. PDP Similar Items
Goal:
help shopper compare alternatives
Candidate sources:
- same category,
- similar price,
- same brand,
- different seller,
- embedding similarity,
- co-view,
- co-click from PDP sessions.
Avoid:
- exact same product duplicate,
- out-of-stock,
- irrelevant accessories if surface says similar.
Metrics:
- click to PDP,
- add to cart,
- purchase,
- comparison satisfaction,
- bounce reduction.
14. Substitutes vs Complements
Substitute:
alternative to current product
Complement:
used with current product
Example:
- camera body substitute: another camera body.
- complement: memory card, lens, bag.
Do not mix unless UI explains.
Separate candidate sources and labels.
15. Cart Complements
Goal:
increase basket usefulness and value
Examples:
- phone case for phone,
- memory card for camera,
- batteries for device,
- refill for consumable,
- compatible accessory.
Hard constraints:
- compatibility,
- inventory,
- price relevance,
- no duplicate already in cart,
- not annoying.
Cart recommendations are sensitive because shopper is close to purchase.
16. Checkout Recommendations
Use very carefully.
Rules:
- low distraction,
- highly compatible,
- available,
- low return risk,
- no policy ambiguity,
- no slow dependency,
- no high-risk sponsored clutter.
Sometimes no recommendation is better than bad checkout distraction.
Guardrails:
- checkout completion,
- cart abandonment,
- latency,
- return/refund.
17. Replenishment
For consumables:
estimate repurchase interval
recommend when likely needed
Features:
- purchase history,
- product consumption cycle,
- quantity,
- household size if known/allowed,
- seasonality.
Examples:
- coffee,
- skincare,
- pet food,
- printer ink.
Avoid recommending durable products repeatedly.
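A very simple way to estimate the repurchase interval is the median gap between a shopper's past purchases of the same consumable. The sketch below assumes that approach; the function name and the `min_purchases` threshold are illustrative, and a production system would likely also use quantity and seasonality.

```python
from datetime import date, timedelta
from statistics import median


def next_replenishment_date(purchase_dates, min_purchases=3):
    """Estimate when the shopper will need the item again, or None if history is too thin."""
    if len(purchase_dates) < min_purchases:
        return None
    ordered = sorted(purchase_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    # Median is less sensitive than the mean to one unusually long gap.
    return ordered[-1] + timedelta(days=median(gaps))
```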
18. Purchased Suppression
Purchased item logic:
Durable
Suppress same product for long window.
laptop, camera, refrigerator
Consumable
Replenish after interval.
coffee, detergent, pet food
Collectible/Fashion
May recommend similar/variant but avoid exact duplicate.
Domain semantics matter.
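The three cases can be expressed as one suppression rule keyed by product type. This is a sketch: the 365-day durable window is an assumed placeholder, and the function decides only whether to hide the exact purchased item, not similar items or variants.

```python
from datetime import date, timedelta
from enum import Enum


class ProductType(Enum):
    DURABLE = "durable"
    CONSUMABLE = "consumable"
    FASHION = "fashion"   # also covers collectibles in this sketch


DURABLE_WINDOW = timedelta(days=365)  # assumption: "long window" is not quantified in the text


def suppress_exact_item(product_type, last_purchased, today, replenish_interval=None):
    """Decide whether the exact purchased item should be hidden from recommendations."""
    if last_purchased is None:
        return False  # never purchased, nothing to suppress
    elapsed = today - last_purchased
    if product_type is ProductType.DURABLE:
        return elapsed < DURABLE_WINDOW
    if product_type is ProductType.CONSUMABLE:
        # Hide only until the replenishment interval has passed.
        return replenish_interval is not None and elapsed < replenish_interval
    # Fashion/collectible: the exact duplicate stays hidden; variants are handled elsewhere.
    return True
```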
19. Inventory Freshness
Inventory can change quickly.
Strategies:
- nearline inventory events,
- final availability check,
- stock confidence feature,
- overfetch candidates,
- fallback deeper list,
- exclude low-stock for email/push if risk high.
Never send a push or email for an item that is likely to be unavailable by the time the user opens it.
20. Pricing and Promotion
Pricing features:
current price
discount percentage
price drop
relative price bucket
margin
promotion active
coupon eligible
free shipping threshold
But business boosts must be controlled.
Do not let discount/sponsored boost override relevance/safety.
21. Sponsored Products
Sponsored recommendations need:
- disclosure,
- campaign active,
- relevance floor,
- policy eligibility,
- budget pacing,
- frequency cap,
- seller trust,
- auction score if ads system,
- organic/sponsored blending.
RecSys should not hide sponsored nature.
22. Marketplace Seller Health
Recommendation affects sellers.
Monitor:
seller exposure concentration
new seller time to first exposure
qualified seller exposure
seller trust-weighted exposure
counterfeit/fraud exposure
long-tail revenue
seller churn
Do not blindly concentrate all exposure to top sellers if marketplace health matters.
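Seller exposure concentration needs a concrete definition before it can be monitored. One common choice, used here as an assumption and not prescribed by the text, is the Herfindahl-Hirschman index (HHI): the sum of squared exposure shares. It is 1.0 when one seller gets every impression and approaches 0 as exposure spreads evenly across many sellers.

```python
from collections import Counter


def exposure_hhi(impression_seller_ids):
    """Herfindahl-Hirschman index over seller impression shares."""
    counts = Counter(impression_seller_ids)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((n / total) ** 2 for n in counts.values())
```

Tracking this per region and category, not only globally, avoids hiding a concentrated category behind a healthy overall number.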
23. Seller Trust and Fraud
Features:
seller_rating
seller_age
fulfillment_success_rate
refund_rate
complaint_rate
counterfeit_risk
policy_violation_count
review_authenticity
shipping_delay_rate
High-risk sellers should be filtered/downranked.
Do not use raw sales without trust adjustment.
24. Return/Refund-Aware Ranking
A return or refund is a negative long-term signal.
Model tasks:
p_purchase
p_return
p_refund
expected_margin
expected_satisfaction
Utility:
expected_value =
    purchase_value
  - return_cost
  - refund_risk
  - complaint_risk
This prevents pushing products that sell but disappoint.
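A worked sketch of that utility with made-up numbers. Assumptions: `p_return` is conditional on purchase, a returned item loses its margin and also incurs `return_cost`, and refund risk is folded into the return term for brevity. All parameter names are illustrative.

```python
def expected_value(p_purchase, margin, p_return, return_cost,
                   p_complaint=0.0, complaint_cost=0.0):
    """Expected value of one impression under the assumptions stated above."""
    kept_margin = (1.0 - p_return) * margin          # margin survives only if the item is kept
    expected_return_cost = p_return * return_cost    # logistics/handling when it comes back
    expected_complaint_cost = p_complaint * complaint_cost
    return p_purchase * (kept_margin - expected_return_cost - expected_complaint_cost)


# Product A converts better but is returned 40% of the time.
product_a = expected_value(p_purchase=0.10, margin=100.0, p_return=0.40, return_cost=50.0)
# Product B converts slightly worse but is rarely returned.
product_b = expected_value(p_purchase=0.08, margin=100.0, p_return=0.05, return_cost=50.0)
```

With these numbers product A is worth about 4.0 per impression and product B about 7.4, so the product that sells less often still ranks higher.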
25. Review and Rating Features
Use:
- rating average,
- review count,
- verified purchase ratio,
- recent review trend,
- review authenticity,
- rating by segment.
Smoothing is important.
A 5-star product with 2 reviews should not outrank 4.6-star with 10,000 reviews blindly.
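A Bayesian average is one simple smoothing scheme: pull each product's rating toward a prior mean, with the pull weakening as reviews accumulate. The prior mean (4.2) and prior weight (50 pseudo-reviews) below are assumed values for illustration; in practice they would be fit per category.

```python
def smoothed_rating(avg_rating, review_count, prior_mean=4.2, prior_weight=50):
    """Bayesian average: behaves like prior_mean at 0 reviews, like avg_rating at many."""
    return (prior_weight * prior_mean + review_count * avg_rating) / (prior_weight + review_count)


few_reviews = smoothed_rating(5.0, 2)        # roughly 4.23
many_reviews = smoothed_rating(4.6, 10_000)  # roughly 4.60
```

After smoothing, the 4.6-star product with 10,000 reviews scores above the 5-star product with 2 reviews.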
26. Content and Image Features
E-commerce products need multimodal features:
- title embedding,
- description embedding,
- image embedding,
- category path,
- brand,
- attributes,
- style/color,
- compatibility tags.
Cold-start products rely heavily on content features.
Metadata extraction quality matters.
27. Product Embeddings
Embedding families:
product_content_embedding
two_tower_item_embedding
image_style_embedding
co_buy_graph_embedding
Use cases:
- similar products,
- cold-start retrieval,
- semantic search,
- style matching,
- cross-category complements.
Keep versions and index metadata.
28. E-commerce Feature Set
Feature groups:
User
category affinity
brand affinity
price preference
discount sensitivity
seller affinity
purchase frequency
return tendency
Item
quality score
CTR/CVR
rating/reviews
price bucket
discount
inventory confidence
return risk
seller trust
Cross
user_category_match
user_brand_match
price_fit
seen_count
similar_to_recent_item
cart_compatibility
Context
surface
device
region
season
query/cart context
29. Ranking Objective
The e-commerce objective is often multi-task:
p_view
p_add_to_cart
p_purchase
p_return
p_hide
p_report
expected_margin
seller_trust
Utility example:
score =
    1.0 * p_view
  + 3.0 * p_add_to_cart
  + 8.0 * p_purchase * expected_margin
  - 5.0 * p_return
  - 20.0 * p_report
  + freshness_bonus
Weights are business/product decisions.
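The utility example translates directly into a scoring function over the model heads. In this sketch the predictions arrive as a plain dict and `expected_margin` is assumed to be normalized to a 0..1 scale so that it is comparable to the probabilities; both choices are assumptions.

```python
# Weights copied from the utility example; in practice they are tuned per surface.
WEIGHTS = {
    "view": 1.0,
    "add_to_cart": 3.0,
    "purchase_margin": 8.0,
    "return": 5.0,
    "report": 20.0,
}


def utility_score(pred, freshness_bonus=0.0):
    """Combine multi-task predictions into one ranking score."""
    return (
        WEIGHTS["view"] * pred["p_view"]
        + WEIGHTS["add_to_cart"] * pred["p_add_to_cart"]
        + WEIGHTS["purchase_margin"] * pred["p_purchase"] * pred["expected_margin"]
        - WEIGHTS["return"] * pred["p_return"]
        - WEIGHTS["report"] * pred["p_report"]
        + freshness_bonus
    )


example = {
    "p_view": 0.30, "p_add_to_cart": 0.10, "p_purchase": 0.05,
    "expected_margin": 0.20, "p_return": 0.02, "p_report": 0.001,
}
score = utility_score(example)  # 0.30 + 0.30 + 0.08 - 0.10 - 0.02 = 0.56
```

Keeping the weights in one named table makes it easier to review them as the product decision they are, separate from the model code.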
30. Revenue vs Trust
Optimizing revenue alone can hurt trust.
Guardrails:
- return rate,
- complaint rate,
- hide/report,
- seller fraud,
- low rating exposure,
- repeat purchase,
- retention.
A long-term marketplace requires trust.
31. Reranking for E-commerce
Reranking constraints:
max same seller
max same brand
max same category
sponsored cap
deal cap
new item exploration slots
seller diversity
price diversity
inventory confidence floor
no duplicate variants
Surface-specific.
Home feed wants diversity. PDP similar may prefer narrower category.
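A greedy pass over the ranked list is a common, simple way to enforce caps such as max same seller and max same brand. The sketch below assumes items are dicts with `seller_id` and `brand` keys (hypothetical field names) and that the input is already sorted by score.

```python
from collections import Counter


def rerank_with_caps(ranked, slate_size, max_per_seller, max_per_brand):
    """Walk the ranked list in order, skipping items that would break a cap."""
    seller_counts, brand_counts = Counter(), Counter()
    slate = []
    for item in ranked:
        if len(slate) == slate_size:
            break
        if seller_counts[item["seller_id"]] >= max_per_seller:
            continue
        if brand_counts[item["brand"]] >= max_per_brand:
            continue
        slate.append(item)
        seller_counts[item["seller_id"]] += 1
        brand_counts[item["brand"]] += 1
    return slate
```

Greedy selection can return fewer than `slate_size` items when caps are tight, which is one reason to overfetch candidates and to pass looser caps for surfaces like PDP similar.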
32. Variant Deduplication
Avoid showing:
same shirt in 10 colors
same phone storage variants
same seller duplicates
Use product family/dedup group.
Reranker picks representative based on:
- availability,
- price,
- rating,
- user preference,
- image quality.
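A sketch of representative selection per dedup group. It uses only three of the criteria above (availability, then rating, then lower price) as an assumed preference order; the field names are hypothetical. Each group keeps the slate position of its best-ranked member.

```python
def _preference(item):
    # In-stock first, then higher rating, then lower price.
    return (item["in_stock"], item["rating"], -item["price"])


def pick_representatives(ranked):
    """Collapse variants: one item per dedup_group, ordered by the group's first appearance."""
    best = {}
    group_order = []
    for item in ranked:
        group = item["dedup_group"]
        if group not in best:
            group_order.append(group)
            best[group] = item
        elif _preference(item) > _preference(best[group]):
            best[group] = item
    return [best[group] for group in group_order]
```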
33. Price Diversity
Users may have price preference.
Slate can include:
- within preferred range,
- some cheaper alternatives,
- premium option if relevant.
Do not recommend only expensive items if user historically buys budget items.
Price mismatch hurts conversion/trust.
34. New Product Cold-Start
Strategies:
- content embeddings,
- category priors,
- seller trust prior,
- brand prior,
- exploration budget,
- editorial/new-arrival source,
- small exposure ramp,
- quality/policy floor.
Metrics:
time_to_first_impression
time_to_first_click
new_product_cvr
new_product_report_rate
35. New Seller Cold-Start
Strategies:
- trust verification,
- limited exposure ramp,
- category/region match,
- quality floor,
- fulfillment monitoring,
- fraud checks,
- seller onboarding metadata quality.
Do not give unlimited exposure to untrusted seller.
36. Regional and Logistics Constraints
E-commerce is local.
Constraints:
ship to region
delivery ETA
shipping cost
warehouse availability
regulatory restrictions
currency
tax
returns availability
Recommending an item that cannot be shipped to the shopper's region is a bad experience.
Include logistics in eligibility/features.
37. Seasonality
Examples:
- Ramadan/Eid,
- Christmas,
- back-to-school,
- payday,
- weather,
- sports events.
Features:
seasonal category trend
regional calendar
promotion calendar
historical seasonality
Be careful not to overfit one event.
38. Email Recommendations
Email types:
daily deals
wishlist price drop
cart abandonment
back in stock
replenishment
new arrivals in followed category
seller/brand updates
Requirements:
- consent,
- unsubscribe,
- frequency cap,
- fresh inventory,
- send-time validation,
- tracking,
- attribution,
- no sensitive/policy risky content.
39. Push Recommendations
Push should be rare and high confidence.
Good triggers:
price drop on wished item
back in stock
order-related complement
replenishment due
limited-time relevant deal
Guardrails:
- notification opt-in,
- quiet hours,
- fatigue,
- unsubscribe,
- hide/report,
- conversion,
- app uninstall risk.
40. Search and Category Interaction
Search/category ranking is often a separate system, but RecSys can support it with:
- personalized category boosts,
- similar products,
- query recommendations,
- related searches,
- product modules.
Search is intent-driven. Personalization should not override query intent too aggressively.
41. E-commerce Event Schema Additions
Purchase event:
{
  "event_id": "purchase_001",
  "user_id": "u123",
  "order_id": "order_789",
  "items": [
    {
      "product_id": "p1",
      "sku_id": "sku1",
      "offer_id": "offer1",
      "seller_id": "seller1",
      "price": 1200000,
      "currency": "IDR",
      "quantity": 1
    }
  ],
  "event_time": "2026-07-02T10:30:00Z"
}
Return/refund events should link to order/item.
42. Attribution Challenges
Attribution sources:
- search,
- recommendation home,
- PDP similar,
- email,
- ads/sponsored,
- cart module.
Need attribution model:
last-touch
multi-touch
surface-specific
incrementality experiment
Avoid over-crediting recommendations for purchases user would make anyway.
43. Offline Evaluation Metrics
Retrieval:
purchase recall@K
cart recall@K
similar item recall
complement recall
Ranking:
NDCG@K for add_to_cart/purchase
AUC/logloss for purchase/return
calibration
Slate:
seller/category diversity
duplicate variant rate
new item exposure
sponsored cap
Guardrails:
return/refund
hide/report
out-of-stock click rate
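For reference, a minimal sketch of two of these metrics. `recall_at_k` works for purchase or cart recall depending on which relevant set is passed in. `ndcg_at_k` here uses linear gains (some definitions use 2^gain - 1 instead) and assumes non-negative grades, so negative labels would need to be clipped first.

```python
import math


def recall_at_k(recommended, relevant, k):
    """Fraction of relevant items (e.g. purchased) that appear in the top k."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = len(set(recommended[:k]) & relevant)
    return hits / len(relevant)


def ndcg_at_k(recommended, gains, k):
    """NDCG with linear gains; `gains` maps item_id -> non-negative relevance grade."""
    dcg = sum(
        gains.get(item, 0) / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 2) for rank, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0
```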
44. Online Experiments
Experiment primary metrics by surface:
Home
purchase per user
add-to-cart rate
CTR
retention
PDP Similar
similar module CTR
add-to-cart
purchase assist
bounce reduction
Cart
AOV
checkout completion
cart abandonment
return rate
Email/Push
incremental purchase
unsubscribe
notification disable
Always include guardrails.
45. Observability
E-commerce dashboards:
candidate source contribution
out-of-stock rejection
seller exposure
category exposure
return/refund by model
sponsored exposure
inventory freshness
cold-start product exposure
price bucket distribution
fallback rate
email/push fatigue
By region/category/seller.
46. Safety and Policy
Policy concerns:
restricted products
counterfeit
fraud sellers
misleading listings
dangerous products
age/region restrictions
sponsored disclosures
review manipulation
Final validation is mandatory for high-risk products.
Trust and safety should own taxonomy.
47. Privacy
E-commerce data can reveal sensitive interests.
Controls:
- consent-aware personalization,
- non-personalized path,
- hide/reset recommendations,
- retention,
- debug redaction,
- sensitive category handling,
- no sensitive reason explanations.
Example risky explanation:
Recommended because you often buy medical products.
Use safe reason codes.
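One way to enforce this is to render explanations only from a fixed template table and to fall back to a generic message for sensitive categories. The category set, reason codes, and template strings below are all hypothetical; the real taxonomy would come from the trust and safety team.

```python
# Assumed example set; the real list belongs to the trust and safety taxonomy.
SENSITIVE_CATEGORIES = frozenset({"medical", "health", "adult"})

REASON_TEMPLATES = {
    "recent_view": "Similar to items you viewed",
    "category_affinity": "Because you shop in {category}",
    "trending": "Popular in your region",
}
GENERIC_REASON = "Recommended for you"


def safe_reason(reason_code, category):
    """Render a user-facing reason without revealing sensitive interests."""
    if category in SENSITIVE_CATEGORIES:
        return GENERIC_REASON
    template = REASON_TEMPLATES.get(reason_code)
    if template is None:
        return GENERIC_REASON  # unknown codes never leak raw internals
    return template.format(category=category)
```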
48. Implementation Blueprint
Services/modules:
rec-api
catalog-adapter
candidate-service
eligibility-service
ranking-service
slate-service
profile-store
feature-store
event-ingestion
batch-scoring
embedding-index-pipeline
experiment-integration
observability
First production slice:
home_feed + PDP_similar
Then add:
cart_complements
email_replenishment
push_price_drop
marketplace_health_reranking
49. E-commerce Minimum Feature/Roadmap Plan
Phase 1
popular/trending
profile category candidates
PDP similar by category
heuristic ranker
inventory/policy filter
decision/impression/click events
Phase 2
co-view/co-buy item-to-item
purchase/add-to-cart labels
GBDT ranker
seller trust features
variant dedup
email recommendations
Phase 3
two-tower retrieval
content/image embeddings
multi-task ranker purchase/return
batch scoring
cold-start exploration
marketplace exposure health
Phase 4
bandits
advanced causal incrementality
personalized promotion strategy
LLM-assisted catalog enrichment/explanation
50. Common E-commerce RecSys Failure Modes
50.1 Recommending Out-of-Stock Items
Inventory/final check failure.
50.2 Recommending Already Purchased Durable
Suppression semantics missing.
50.3 Same Variant Spam
Dedup group missing.
50.4 CTR Optimization Increases Returns
Wrong objective.
50.5 Sponsored Overrides Relevance
Trust loss.
50.6 Trending Amplifies Fraud
Raw engagement abuse.
50.7 New Products Never Exposed
Cold-start failure.
50.8 Low-Trust Seller Gets Exposure
Trust feature missing.
50.9 Email Sends Stale Deals
Batch validation missing.
50.10 Global Popular Ignores Region/Logistics
Availability mismatch.
51. E-commerce RecSys Readiness Checklist
[ ] Product/SKU/offer recommendation unit is defined.
[ ] Catalog eligibility includes inventory/region/policy/seller.
[ ] Surface-specific objectives are defined.
[ ] PDP similar and cart complement are separated.
[ ] Purchased suppression semantics exist by product type.
[ ] Candidate sources preserve provenance.
[ ] Ranking objective includes purchase and negative outcomes, not only click.
[ ] Return/refund risk is monitored.
[ ] Seller trust/fraud signals are included.
[ ] Variant/dedup group reranking exists.
[ ] Sponsored rules/disclosures are enforced.
[ ] Cold-start product/seller strategy exists.
[ ] Email/push consent and freshness checks exist.
[ ] Inventory final validation exists.
[ ] Marketplace exposure metrics exist.
[ ] Privacy-safe reason codes exist.
[ ] Out-of-stock/stale/precomputed rejection metrics exist.
52. Conclusion
E-commerce RecSys is a very good domain for building end-to-end skills because it forces us to combine relevance, conversion, inventory, pricing, seller trust, policy, returns, marketplace health, and personalization.
Key principles:
- E-commerce recommendation optimizes useful shopping outcomes, not just clicks.
- Product/SKU/offer distinction matters.
- Eligibility must include inventory, region, policy, seller, and campaign state.
- Surface objectives differ: home, PDP, cart, checkout, email, push.
- Similar items and complements are different problems.
- Purchased suppression must understand durable vs consumable products.
- Ranking should account for purchase, return/refund, margin, trust, and satisfaction.
- Sponsored/business boosts must not override relevance/safety.
- Cold-start product/seller needs controlled exposure and trust floors.
- Marketplace health and long-term trust are first-class metrics.
In Part 077, we will cover the Content Feed Recommendation System, a domain (feeds, news, video, social, learning content) with a different set of challenges: freshness, session intent, creator ecosystem, safety, diversity, dwell, fatigue, and long-term satisfaction.