Foundations of Machine Learning


The operational schedule for Foundations of Machine Learning. Per-cohort dates fill in at intake; the structure below is stable across cohorts.

The single source of truth is _data/apprentissage-automatique.yml. Edits there flow through this page automatically.

Week	Title	Pitch	Detail
01	Introduction to Machine Learning	Arthur Samuel's 1959 checkers program: the program plays, the program improves, the code doesn't change. Sixty-five years later, every spam filter, credit scorer, and MRI tumor detector descends from that single intuition.	week 01 →
02	Linear and Polynomial Regression	Gauss 1801 predicting Ceres from forty days of observations: the original machine-learning success.	week 02 →
03	Classification — Logistic Regression, k-NN, Naive Bayes	Fisher's 1936 iris paper: linear discriminant analysis, one of the first formal classification methods. Its descendants now classify spam, fraud, tumors, and signals.	week 03 →
04	Regularization — Ridge, Lasso, Elastic Net	Hoerl-Kennard 1970 ridge regression: an industrial fix for unstable least squares. Tibshirani 1996 LASSO: variable selection by optimization.	week 04 →
05	Support Vector Machines	Boser, Guyon, and Vapnik 1992: the maximum-margin classifier. A dominant method from roughly 1995 to 2012.	week 05 →
06	Decision Trees	Breiman, Friedman, Olshen, Stone 1984: classification by sequential yes/no questions. The transparent classifier.	week 06 →
07	Ensemble Methods	The 2009 Netflix Prize win by BellKor's Pragmatic Chaos: 100+ predictors blended into a single model. Ensembles have featured in most winning Kaggle entries since.	week 07 →
08	Unsupervised Learning — Clustering	Lloyd 1957 at Bell Labs: partition signals into groups, pick a representative, iterate. The most-taught clustering algorithm in ML.	week 08 →
09	Dimensionality Reduction	Pearson 1901: find the hyperplane minimizing the sum of squared orthogonal distances to the data. The solution is given by the eigenvectors of the covariance matrix.	week 09 →
10	Bayesian Learning	Sahami et al. 1998 Bayesian spam filter: variants went on to run in email services worldwide. The Bayesian posture makes uncertainty a first-class output: predictions come with posterior distributions.	week 10 →
11	Model Selection and Learning Theory	Why does any of this work? The statistical learning theory that bounds generalization error.	week 11 →
12	Kernel Methods	Mercer 1909, revived for SVMs in 1992: an implicit way to work in high-dimensional feature spaces. Any linear algorithm written in terms of inner products acquires a nonlinear cousin.	week 12 →

Operational notes

Default timezone: Africa/Lagos (UTC+1). Per-cohort timing negotiated at intake.
Lab notebooks and problem-set repos live in the cohort GitHub organization.
The bilingual lecture notes remain the reference text.