Week 12 — Kernel Methods

Mercer 1909, rediscovered in 1995: an implicit way to work in high-dimensional feature spaces. Every linear algorithm that uses its data only through inner products acquires a nonlinear cousin.
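A minimal sketch of what "implicit" means here, assuming the homogeneous degree-2 polynomial kernel on two-dimensional inputs. The function names `poly2_features` and `poly2_kernel` are illustrative and not taken from the course materials. The kernel value matches the inner product of explicit features, yet the features are never built.

```python
import math

def poly2_features(x):
    """Explicit feature map for k(x, z) = (x . z)^2 on R^2 (illustrative name)."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]

def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def poly2_kernel(x, z):
    """Same quantity computed in input space, without forming the features."""
    return dot(x, z) ** 2

x = [1.0, 2.0]
z = [3.0, -1.0]

explicit = dot(poly2_features(x), poly2_features(z))  # inner product in feature space
implicit = poly2_kernel(x, z)                         # kernel evaluation in input space
print(explicit, implicit)
```

For the RBF kernel the corresponding feature space is infinite-dimensional, so only the implicit route is available.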

Lecture

Reproducing kernel Hilbert spaces · Mercer’s theorem · the representer theorem (Kimeldorf-Wahba 1971) · canonical kernels (linear, polynomial, RBF, Matérn) · kernelized ridge regression, kernel PCA, kernel SVM.
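As orientation for the kernelized ridge regression topic, here is a hedged NumPy sketch using an RBF kernel. It assumes the convention of solving (K + λI)α = y; some texts scale λ by the number of samples. The function names, the toy sine data, and the values of `lam` and `gamma` are all assumptions made for illustration. The prediction step has the form guaranteed by the representer theorem: a weighted sum of kernel evaluations at the training points.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gram matrix with entries exp(-gamma * ||a_i - b_j||^2)."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def fit_kernel_ridge(X, y, lam=1e-2, gamma=1.0):
    """Solve (K + lam * I) alpha = y for the dual coefficients alpha."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict_kernel_ridge(X_train, alpha, X_new, gamma=1.0):
    """Representer form: f(x) = sum_i alpha_i * k(x_i, x)."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# Toy data: noisy samples of sin(x) on [0, 2*pi] (an assumption for the demo).
rng = np.random.default_rng(0)
X = np.linspace(0.0, 2.0 * np.pi, 40)[:, None]
y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(40)

alpha = fit_kernel_ridge(X, y, lam=1e-2, gamma=1.0)
X_test = np.array([[np.pi / 2.0], [np.pi]])
y_hat = predict_kernel_ridge(X, alpha, X_test, gamma=1.0)
print(y_hat)  # should be close to sin(pi/2) and sin(pi)
```

The model has one coefficient per training point rather than one per feature, which is why the cost grows with the sample size and not with the dimension of the feature space.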

Read before the lecture

Schölkopf and Smola, *Learning with Kernels* (MIT Press 2002), chapter 2

Recitation — paper discussion

Wilson, Hu, Salakhutdinov, Xing, *Deep Kernel Learning* (AISTATS 2016)

Come ready to argue one side of each:

What does deep kernel learning give you that a plain neural network doesn't?
When are kernel methods still preferable to deep learning in 2026?

Reference text for this week: chapter 12 of the bilingual notes — EN PDF · FR PDF.