Week 12 — Kernel Methods
Mercer 1909, rediscovered in 1995: an implicit way to work in high-dimensional feature spaces. Every linear algorithm acquires a nonlinear cousin.
Lecture
Reproducing kernel Hilbert spaces · Mercer’s theorem · the representer theorem (Kimeldorf-Wahba 1971) · canonical kernels (linear, polynomial, RBF, Matérn) · kernelized ridge regression, kernel PCA, kernel SVM.
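As a warm-up for the lecture, here is a minimal sketch of kernelized ridge regression with an RBF kernel, assuming NumPy. The function names, the toy sine data, and the values of `lam` and `lengthscale` are illustrative assumptions, not course code. By the representer theorem the fitted function has the form f(x) = Σ_i α_i k(x_i, x), so fitting reduces to solving the linear system (K + λI) α = y.

```python
import numpy as np


def rbf_kernel(A, B, lengthscale=1.0):
    """RBF (Gaussian) kernel matrix between the rows of A and the rows of B."""
    # Pairwise squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * lengthscale**2))


def fit_kernel_ridge(X, y, lam=1e-3, lengthscale=1.0):
    """Solve (K + lam * I) alpha = y for the dual coefficients alpha."""
    K = rbf_kernel(X, X, lengthscale)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)


def predict_kernel_ridge(X_train, alpha, X_new, lengthscale=1.0):
    """Evaluate f(x) = sum_i alpha_i k(x_i, x) at each row of X_new."""
    return rbf_kernel(X_new, X_train, lengthscale) @ alpha


# Toy data (an assumption for illustration): noise-free samples of sin(x)
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(50, 1))
y = np.sin(X[:, 0])

alpha = fit_kernel_ridge(X, y)
X_test = np.linspace(-2.0, 2.0, 5)[:, None]
y_pred = predict_kernel_ridge(X, alpha, X_test)
```

Note that the feature map never appears explicitly: both fitting and prediction touch the data only through kernel evaluations, which is what lets the same template carry over to kernel PCA and kernel SVM.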
Read before the lecture
- Schölkopf and Smola, *Learning with Kernels* (MIT Press 2002), chapter 2
Recitation — paper discussion
Wilson, Hu, Salakhutdinov, Xing, *Deep Kernel Learning* (AISTATS 2016)
Come ready to argue one side of each:
- What does deep kernel learning give you that a plain neural network doesn't?
- When are kernel methods still preferable to deep learning in 2026?
Reference text for this week: chapter 12 of the bilingual notes — EN PDF · FR PDF.