Week 07 — Ensemble Methods

BellKor's 2009 Netflix Prize win combined 100+ predictors linearly, and most winning Kaggle entries since have been ensembles too.

Lecture

Bagging and Breiman’s random forest (2001) · boosting (AdaBoost, gradient boosting) · stacking · the bias-variance decomposition of ensemble error · XGBoost (Chen and Guestrin 2016) and the GBM tooling stack.
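
As a warm-up for the boosting part of the lecture, here is a minimal sketch of gradient boosting for squared loss, where each round fits a one-split regression stump to the current residuals. It is illustrative only: the names fit_stump and gradient_boost are hypothetical, inputs are assumed to be one-dimensional with at least two distinct values, and none of the regularisation or split-finding tricks used by XGBoost are included.

```python
def fit_stump(xs, residuals):
    """Fit the best single-split regression stump (least squares) on 1-D inputs."""
    best = None
    order = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct inputs.
    for lo, hi in zip(order, order[1:]):
        threshold = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Return a predictor built from n_rounds stumps fitted to residuals."""
    base = sum(ys) / len(ys)  # initial constant model: the mean target
    stumps = []

    def predict(x):
        return base + learning_rate * sum(stump(x) for stump in stumps)

    for _ in range(n_rounds):
        # For squared loss the negative gradient is simply the residual.
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict


# Usage: learn a step function from ten points.
xs = list(range(10))
ys = [0.0] * 5 + [1.0] * 5
model = gradient_boost(xs, ys)
print(round(model(0), 3), round(model(9), 3))
```

The learning rate shrinks each stump's contribution, so the fit approaches the targets gradually rather than in one step; that trade-off between the number of rounds and the step size is one of the hyperparameters tuned in the lab.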

Read before the lecture

Code lab

Lab 3 — Gradient boosting in production

Train XGBoost, LightGBM, and CatBoost on the same dataset. Tune hyperparameters. Compare training time, inference time, and accuracy. Audit feature importance with SHAP.
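
One possible shape for the comparison is a small timing harness like the sketch below. It is an assumption about how the lab could be organised, not the notebook's actual code: load_boosters and benchmark are hypothetical helper names, the classifiers are created with default settings (CatBoost with verbose=0 to silence its training log), and predict is assumed to return a flat sequence of class labels, as it does for a binary task.

```python
import time
from importlib import import_module


def load_boosters():
    """Return {name: unfitted classifier} for whichever libraries are installed."""
    specs = {
        "XGBoost": ("xgboost", "XGBClassifier", {}),
        "LightGBM": ("lightgbm", "LGBMClassifier", {}),
        "CatBoost": ("catboost", "CatBoostClassifier", {"verbose": 0}),
    }
    models = {}
    for name, (module, cls, kwargs) in specs.items():
        try:
            models[name] = getattr(import_module(module), cls)(**kwargs)
        except ImportError:
            print(f"{name} not installed, skipping")
    return models


def benchmark(models, X_train, y_train, X_test, y_test):
    """Fit each model once; record fit time, predict time, and test accuracy."""
    rows = []
    for name, model in models.items():
        start = time.perf_counter()
        model.fit(X_train, y_train)
        fit_s = time.perf_counter() - start

        start = time.perf_counter()
        preds = model.predict(X_test)
        predict_s = time.perf_counter() - start

        # Assumes preds is a flat sequence of labels aligned with y_test.
        correct = sum(int(p == y) for p, y in zip(preds, y_test))
        rows.append({
            "model": name,
            "fit_s": fit_s,
            "predict_s": predict_s,
            "accuracy": correct / len(y_test),
        })
    return rows


# Usage (assumes X_train, y_train, X_test, y_test are already loaded):
# for row in benchmark(load_boosters(), X_train, y_train, X_test, y_test):
#     print(row)
```

A single fit per model gives only a rough timing; repeating the run and reporting the median would make the comparison less sensitive to noise, and accuracy alone may be misleading if the default classes are imbalanced.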

Notebook: lab03-boosting.ipynb  ·  Dataset: Kaggle bank-loan default (Cameroon subset).


Reference text for this week: chapter 07 of the bilingual notes — EN PDF · FR PDF.