Week 04 — Recommender systems

Module 4: collaborative filtering, matrix factorization, hybrid models, the gap between offline metrics and online A/B results.


How Netflix-style systems actually work, and the honest evaluation problem they create.

What you ship this week

Hybrid recommender on a real e-commerce dataset, evaluated against three offline metrics with an explicit discussion of why they disagree.

Due Friday 18:00 Africa/Lagos time (UTC+1)
Submission Drop the repo URL into the week's cohort channel. Peer-review pairing announced Monday of next week.
Rubric Pass / revise. Pass requires green CI, tests covering the public API, and a README a stranger can follow to install and run the code.

Live sessions and labs

Default weekly cadence below. Cohort-specific dates and Zoom links are filled in at intake.

Day  Time         Block                                   Recording
Mon  09:00-12:00  Live instruction + code-along           post-session
Mon  14:00-16:00  Independent lab work + TA office hours  post-session
Tue  09:00-12:00  Live instruction + code-along           post-session
Tue  14:00-16:00  Independent lab work + TA office hours  post-session
Wed  09:00-12:00  Live instruction + code-along           post-session
Wed  14:00-16:00  Independent lab work + TA office hours  post-session
Thu  09:00-12:00  Live instruction + code-along           post-session
Thu  14:00-16:00  Independent lab work + TA office hours  post-session
Fri  10:00-11:00  Industry speaker                        post-session
Fri  11:30-12:30  Lab review                              post-session
Fri  14:00-15:00  Cohort retrospective                    post-session

Learning outcomes

By the end of the week, every participant will:

  1. Implement collaborative filtering (user-based, item-based, matrix factorization); see the sketch after this list.
  2. Implement content-based filtering with embeddings.
  3. Build a hybrid recommender and evaluate with offline metrics (precision@k, NDCG, MAP).
  4. Understand why offline metrics often disagree with online A/B results.
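
To ground outcome 1, here is a minimal neighborhood-based sketch, assuming a dense NumPy ratings matrix with zeros for unrated entries. The function names, the choice of cosine similarity, and the global-mean fallback are illustrative assumptions, not the course's prescribed implementation; item-based filtering is the same computation run on the transposed matrix.

```python
import numpy as np

def cosine_sim(M):
    """Row-wise cosine similarity for a dense ratings matrix (zeros = unrated)."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                       # avoid dividing empty rows by zero
    unit = M / norms
    return unit @ unit.T

def predict_user_based(R, user, item, k=5):
    """Predict R[user, item] from the k most similar users who rated the item."""
    sims = cosine_sim(R)[user]
    raters = np.where(R[:, item] > 0)[0]          # users who actually rated this item
    raters = raters[raters != user]
    top = raters[np.argsort(sims[raters])[::-1][:k]]
    if top.size == 0:
        return R[R > 0].mean()                    # cold item: fall back to the global mean
    w = sims[top]
    return float(w @ R[top, item] / (np.abs(w).sum() + 1e-9))
```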

Topics covered

The recommendation problem · explicit vs implicit feedback · collaborative filtering · matrix factorization (SVD, ALS, NMF) · content-based filtering with embeddings · hybrid models · evaluation: precision@k, recall@k, NDCG, MAP, online vs offline metrics · cold-start, popularity bias, filter bubbles · the “you’ll never know until you A/B test” problem.

Labs

Lab 1 — MovieLens collaborative filter

Build a user-item matrix factorization on MovieLens 1M, first with raw SVD, then with ALS. Compare RMSE on held-out ratings.

Dataset: MovieLens 1M (GroupLens).
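
A compressed sketch of the Lab 1 comparison, assuming the ratings are already loaded into a dense matrix R with zeros for missing entries (fine at sketch scale; for the full 1M matrix you would keep it sparse). The global-mean fill for the SVD baseline, the rank, and the regularization constant are illustrative assumptions.

```python
import numpy as np

def svd_baseline(R, k=20):
    """Raw truncated SVD: fill missing entries with the global mean, then factor."""
    mask = R > 0
    filled = np.where(mask, R, R[mask].mean())
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]            # rank-k reconstruction

def als(R, k=20, lam=0.1, iters=10):
    """Alternating least squares fit to the observed entries only."""
    mask = R > 0
    rng = np.random.default_rng(0)
    P = rng.normal(scale=0.1, size=(R.shape[0], k))   # user factors
    Q = rng.normal(scale=0.1, size=(R.shape[1], k))   # item factors
    reg = lam * np.eye(k)
    for _ in range(iters):
        for u in range(R.shape[0]):               # fix Q, solve each user's factors
            j = mask[u]
            P[u] = np.linalg.solve(Q[j].T @ Q[j] + reg, Q[j].T @ R[u, j])
        for i in range(R.shape[1]):               # fix P, solve each item's factors
            j = mask[:, i]
            Q[i] = np.linalg.solve(P[j].T @ P[j] + reg, P[j].T @ R[j, i])
    return P @ Q.T

def rmse(R_true, R_pred, held_out_mask):
    return float(np.sqrt(((R_true - R_pred)[held_out_mask] ** 2).mean()))
```

Zero out a random slice of ratings before fitting and score both reconstructions with rmse on that held-out mask; ALS typically wins because it fits only the observed entries instead of the filled-in placeholders.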

Lab 2 — Hybrid e-commerce recommender

Combine collaborative + content-based + popularity baselines on a public e-commerce dataset. Tune the blending weights against precision@10.

Dataset: Olist Brazilian e-commerce public dataset (Kaggle).
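
One way to wire the Lab 2 blend, assuming each component model already emits a score vector over all items for a given user. The min-max normalization, the 0.1 grid step, and every name here are illustrative assumptions; swap in whatever your components actually return.

```python
import itertools
import numpy as np

def blend(cf, content, pop, w):
    """Weighted sum of min-max normalized component scores for one user."""
    def norm(s):
        span = s.max() - s.min()
        return (s - s.min()) / span if span > 0 else np.zeros_like(s)
    return w[0] * norm(cf) + w[1] * norm(content) + w[2] * norm(pop)

def precision_at_k(ranked, relevant, k=10):
    return len(set(ranked[:k]) & relevant) / k

def tune_weights(per_user_scores, per_user_relevant, step=0.1):
    """Grid-search blend weights summing to 1 against mean precision@10."""
    grid = np.arange(0.0, 1.0 + 1e-9, step)
    best_w, best_p = None, -1.0
    for w0, w1 in itertools.product(grid, grid):
        if w0 + w1 > 1.0 + 1e-9:                  # weights must sum to 1
            continue
        w = (w0, w1, 1.0 - w0 - w1)
        scores = [
            precision_at_k(list(np.argsort(blend(cf, ct, pop, w))[::-1]), rel)
            for (cf, ct, pop), rel in zip(per_user_scores, per_user_relevant)
        ]
        mean_p = float(np.mean(scores))
        if mean_p > best_p:
            best_w, best_p = w, mean_p
    return best_w, best_p
```

Tune on one split of held-out interactions and report precision@10 on a second, untouched split; tuning and reporting on the same split inflates the number.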

Lab 3 — Three metrics, three rankings

Take three candidate recommenders from Labs 1 and 2. Evaluate against precision@10, NDCG@10, and MAP@10. Write a 400-word memo explaining why they rank the models differently.

Dataset: Same as Labs 1 and 2.
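
For Lab 3, the three metrics fit in a few lines each under a binary-relevance assumption (each held-out item is either relevant to a user or not). These are the standard textbook definitions written as a sketch, not a reference implementation; average each over users to get the cohort-level numbers.

```python
import numpy as np

def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top k that is relevant; position within the top k is ignored."""
    return len(set(ranked[:k]) & relevant) / k

def ndcg_at_k(ranked, relevant, k=10):
    """Binary-relevance NDCG: positionally discounted gain over the ideal ranking's gain."""
    dcg = sum(1.0 / np.log2(i + 2) for i, item in enumerate(ranked[:k]) if item in relevant)
    ideal = sum(1.0 / np.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0

def ap_at_k(ranked, relevant, k=10):
    """Average precision@k; MAP@k is the mean of this across users."""
    hits, total = 0, 0.0
    for i, item in enumerate(ranked[:k]):
        if item in relevant:
            hits += 1
            total += hits / (i + 1)
    return total / min(len(relevant), k) if relevant else 0.0
```

The memo writes itself from the definitions: precision@10 only counts hits, NDCG@10 discounts each hit by its position, and MAP@10 rewards stacking hits early, so three models can legitimately swap ranks across the three metrics.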

Readings

Mandatory

Optional deepening