Week 03 — GPT and Text Generation

The GPT family, from 2018's 117M-parameter GPT-1 to today's frontier models at reported trillion-parameter scale, plus the inference-time engineering that makes deployment feasible.

Lecture

The decoder-only transformer in detail · training objectives (causal LM, MLM, span infilling) · decoding strategies (temperature, top-$k$, and top-$p$ sampling; beam search) · KV cache · speculative decoding · batched inference.
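
As a warm-up for the temperature, top-$k$, and top-$p$ topics listed above, here is a minimal sketch in plain Python of how each one reshapes a next-token distribution before a token is drawn. It is an illustration under stated assumptions, not the lecture's reference implementation: the function names (`softmax`, `top_k_filter`, `top_p_filter`, `sample`) and the toy logits are hypothetical, and a real model would apply the same steps to a vocabulary-sized logit vector.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    """Keep the k most probable tokens, zero the rest, renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]

def top_p_filter(probs, p):
    """Keep the smallest high-probability set whose mass reaches p (nucleus)."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = [0.0] * len(probs)
    cumulative = 0.0
    for i in order:
        kept[i] = probs[i]
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(kept)
    return [q / total for q in kept]

def sample(probs, rng):
    """Draw one token index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Toy example: three-token vocabulary (hypothetical logits).
logits = [2.0, 1.0, 0.0]
rng = random.Random(0)
probs = top_p_filter(softmax(logits, temperature=0.8), p=0.9)
token = sample(probs, rng)
```

Lower temperatures concentrate mass on the most likely token, while top-$k$ and top-$p$ truncate the tail; the two filters differ in that top-$k$ keeps a fixed number of tokens and top-$p$ keeps a variable number depending on how peaked the distribution is.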

Read before the lecture

Radford et al., *Improving Language Understanding by Generative Pre-Training* (OpenAI 2018, GPT-1)
Brown et al., *Language Models are Few-Shot Learners* (NeurIPS 2020, GPT-3)

Code lab

Lab 2 — Sampling strategy analysis

On a 1B-parameter open model (e.g., Pythia-1B), generate completions for the same prompt with five sampling strategies. Quantify diversity, coherence, and factuality with simple metrics.
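
The lab does not prescribe specific metrics here. As one possible starting point for the diversity axis, the sketch below computes distinct-$n$, the fraction of unique $n$-grams across a set of generated samples. The function name `distinct_n` and the whitespace tokenization are assumptions made for illustration; coherence and factuality need separate measures and are not covered by this snippet.

```python
def distinct_n(texts, n=2):
    """Fraction of unique n-grams among all n-grams across the samples.

    Assumption: tokens are obtained by whitespace splitting, which is a
    simplification compared with the model's own tokenizer.
    """
    total = 0
    unique = set()
    for text in texts:
        tokens = text.split()
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0

# Two identical samples share every bigram, so half of the bigrams are unique.
repetitive = distinct_n(["the cat sat", "the cat sat"], n=2)
varied = distinct_n(["the cat sat", "a dog ran"], n=2)
```

A value near 1 means the samples rarely repeat the same $n$-gram, and a value near 0 means heavy repetition; comparing the score across the five strategies on the same prompt gives a rough diversity ranking.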

Notebook: lab02-sampling.ipynb · Dataset: —

Reference text for this week: chapter 03 of the bilingual notes — EN PDF · FR PDF.