Week 07 — Ensemble Methods
The BellKor 2009 Netflix Prize win combined 100+ predictors linearly. Most Kaggle competitions since have been won the same way, with ensembles.
Lecture
Bagging and Breiman’s random forest (2001) · boosting (AdaBoost, gradient boosting) · stacking · the bias-variance decomposition of ensemble error · XGBoost (Chen 2016) and the GBM tooling stack.
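As a pointer for the bias-variance item in the list above, one standard result is worth having in hand before the lecture. It is stated here under a simplifying assumption that is not taken from the course notes: the M base predictors f_1, ..., f_M share the same variance σ² and the same pairwise correlation ρ (all of these symbols are introduced only for this sketch). Under that assumption, the variance of their average is:

```latex
\operatorname{Var}\!\left(\frac{1}{M}\sum_{m=1}^{M} f_m(x)\right)
  = \frac{1}{M^2}\Bigl(M\sigma^2 + M(M-1)\,\rho\,\sigma^2\Bigr)
  = \rho\,\sigma^2 + \frac{1-\rho}{M}\,\sigma^2
```

Averaging leaves the bias unchanged, and the second term shrinks as M grows, so the correlation between members sets the floor on the variance. This is one way to read why random forests subsample features at each split: it lowers ρ rather than σ².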
Read before the lecture
Code lab
Lab 3 — Gradient boosting in production
Train XGBoost, LightGBM, and CatBoost on the same dataset. Tune hyperparameters. Compare training time, inference time, and accuracy. Audit feature importance with SHAP.
Notebook: lab03-boosting.ipynb · Dataset: Kaggle bank-loan default (Cameroon subset).
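The comparison the lab asks for (training time, inference time, accuracy) could be organized roughly as below. This is a minimal sketch, not the contents of lab03-boosting.ipynb: the `benchmark` function and the `MajorityClass` placeholder are hypothetical names introduced here, the toy data is made up purely so the block runs, and the only assumption about a model is that it exposes `fit(X, y)` and `predict(X)`.

```python
import time


def benchmark(models, X_train, y_train, X_test, y_test):
    """Fit each model, then record training time, inference time, and accuracy."""
    results = {}
    for name, model in models.items():
        # Wall-clock time spent in fit().
        start = time.perf_counter()
        model.fit(X_train, y_train)
        train_s = time.perf_counter() - start

        # Wall-clock time spent in predict() on the held-out rows.
        start = time.perf_counter()
        preds = list(model.predict(X_test))
        infer_s = time.perf_counter() - start

        # Plain accuracy: fraction of held-out labels predicted exactly.
        correct = sum(int(p == y) for p, y in zip(preds, y_test))
        results[name] = {
            "train_s": train_s,
            "infer_s": infer_s,
            "accuracy": correct / len(y_test),
        }
    return results


class MajorityClass:
    """Hypothetical placeholder so the sketch runs without any GBM library."""

    def fit(self, X, y):
        # Remember the most frequent training label.
        labels = list(y)
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        # Predict that same label for every row.
        return [self.label_ for _ in X]


if __name__ == "__main__":
    # Toy data, invented for illustration only (not the lab dataset).
    X_train = [[0.1], [0.4], [0.8], [0.9]]
    y_train = [0, 0, 0, 1]
    X_test = [[0.2], [0.7]]
    y_test = [0, 1]

    models = {"majority": MajorityClass()}
    for name, row in benchmark(models, X_train, y_train, X_test, y_test).items():
        print(name, row)
```

In the lab itself, the `models` dictionary would presumably hold `xgboost.XGBClassifier`, `lightgbm.LGBMClassifier`, and `catboost.CatBoostClassifier` instances, which all provide `fit` and `predict`. They are left out here so the sketch has no third-party dependencies, and the accuracy line may need adapting if a library returns predictions in a different shape. Keeping the placeholder in the table is also a cheap sanity check: a tuned booster that cannot beat the majority-class baseline points to a data or evaluation problem.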
Reference text for this week: chapter 07 of the bilingual notes — EN PDF · FR PDF.