
Week 01 — Foundations of Language Models

What a language model is, what it isn't, and why ChatGPT was the convergence of three decade-long research programs.

Lecture

From $n$-gram models to neural LMs · the language-modeling objective (next-token prediction) · perplexity · the convergence of the transformer + scale + RLHF · scaling laws (Kaplan 2020, Hoffmann 2022) · what ‘capability’ and ‘alignment’ mean.
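As a small warm-up for the perplexity item above, here is a minimal sketch, assuming the standard definition of perplexity as the exponential of the average negative log-likelihood per token. The function name `perplexity` and the probability values are made up for illustration and are not taken from the course notes.

```python
import math

def perplexity(token_probs):
    """Perplexity from the probability the model gave each observed token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-likelihood per token (natural log).
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    # Exponentiating turns the average loss back into an "effective
    # number of equally likely choices" per token.
    return math.exp(avg_nll)

# Hypothetical probabilities assigned to four observed next tokens.
probs = [0.5, 0.25, 0.5, 0.25]
ppl = perplexity(probs)
print(round(ppl, 3))
```

A model that spreads its probability uniformly over $k$ tokens at every step has perplexity exactly $k$, which is a useful sanity check on any implementation.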

Read before the lecture

Kaplan et al., *Scaling Laws for Neural Language Models* (2020)
Hoffmann et al., *Training Compute-Optimal Large Language Models* (NeurIPS 2022, the Chinchilla paper)

Reference text for this week: chapter 01 of the bilingual notes — EN PDF · FR PDF.