Foundations of Machine Learning — Schedule
Week-by-week schedule of Foundations of Machine Learning.
The operational schedule for Foundations of Machine Learning. Per-cohort dates are filled in at intake; the structure below is stable across cohorts.
The single source of truth is _data/apprentissage-automatique.yml. Edits there flow through to this page automatically.
| Week | Title | Pitch | Detail |
|---|---|---|---|
| 01 | Introduction to Machine Learning | Arthur Samuel's 1959 checkers program: the program plays, the program improves, the code doesn't change. Sixty-five years later, every spam filter, credit scorer, and MRI tumor detector descends from that single intuition. | week 01 → |
| 02 | Linear and Polynomial Regression | Gauss 1801 predicting Ceres from forty days of observations: the original machine-learning success. | week 02 → |
| 03 | Classification — Logistic Regression, k-NN, Naive Bayes | Fisher's 1936 iris paper: linear discriminant analysis, one of the first formal classification methods. Its descendants now classify spam, fraud, tumors, and signals. | week 03 → |
| 04 | Regularization — Ridge, Lasso, Elastic Net | Hoerl-Kennard 1970 ridge regression: an industrial fix for unstable least squares. Tibshirani 1996 LASSO: variable selection by optimization. | week 04 → |
| 05 | Support Vector Machines | Vapnik 1992: the largest-margin classifier. The dominant method from 1995 to 2012. | week 05 → |
| 06 | Decision Trees | Breiman, Friedman, Olshen, Stone 1984: classification by sequential yes/no questions. The transparent classifier. | week 06 → |
| 07 | Ensemble Methods | The BellKor 2009 Netflix Prize win: 100+ predictors combined linearly. Ensembles have won most Kaggle competitions since. | week 07 → |
| 08 | Unsupervised Learning — Clustering | Lloyd 1957 at Bell Labs: partition signals into groups, pick a representative, iterate. The most-taught clustering algorithm in ML. | week 08 → |
| 09 | Dimensionality Reduction | Pearson 1901: find the hyperplane minimizing orthogonal distances. The result surfaces the eigenvectors of the covariance matrix. | week 09 → |
| 10 | Bayesian Learning | Sahami 1998 Bayesian spam filter: nearly every email service has since run a variant. The Bayesian posture changes everything for uncertainty quantification. | week 10 → |
| 11 | Model Selection and Learning Theory | Why does any of this work? The statistical learning theory that bounds generalization error. | week 11 → |
| 12 | Kernel Methods | Mercer 1909, rediscovered in the early 1990s: an implicit way to work in high-dimensional feature spaces. Every linear algorithm acquires a nonlinear cousin. | week 12 → |
Operational notes
- Default timezone: Africa/Lagos (UTC+1). Per-cohort timing is negotiated at intake.
- Lab notebooks and problem-set repos live in the cohort GitHub organization.
- The bilingual lecture notes remain the reference text.