Week 03 — GPT and Text Generation

The GPT family, from 2018's 117M-parameter GPT-1 to today's frontier models at a reported trillion-parameter scale, plus the inference-time engineering that makes deployment feasible.

Lecture

The decoder-only transformer in detail · training objectives (causal LM, MLM, span infilling) · sampling strategies (temperature, top-$k$, top-$p$, beam search) · KV cache · speculative decoding · batched inference.
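As a rough illustration of the sampling strategies listed above, the following sketch applies temperature scaling, top-$k$ filtering, and top-$p$ (nucleus) filtering to a toy logits vector. It uses only the Python standard library; the function names and the four-token logits are made up for illustration and are not taken from the lecture materials. Beam search is not covered here.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep the k most probable tokens, zero out the rest, renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = set(order[:k])
    filtered = [p if i in kept else 0.0 for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = set(), 0.0
    for i in order:
        kept.add(i)
        mass += probs[i]
        if mass >= p:
            break
    filtered = [q if i in kept else 0.0 for i, q in enumerate(probs)]
    total = sum(filtered)
    return [q / total for q in filtered]


def sample(probs, rng):
    """Draw one token index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits over a four-token vocabulary.
logits = [2.0, 1.0, 0.5, -1.0]
rng = random.Random(0)  # fixed seed so runs are repeatable

probs = softmax(logits, temperature=0.7)
next_token = sample(top_p_filter(top_k_filter(probs, k=3), p=0.9), rng)
```

Lower temperatures concentrate probability on the highest-scoring token, while top-$k$ and top-$p$ truncate the tail before sampling. Chaining the two filters, as in the last line, is one common choice rather than a requirement.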

Read before the lecture

Code lab

Lab 2 — Sampling strategy analysis

On a 1B-parameter open model (e.g., Pythia-1B), generate from the same prompt with five sampling strategies. Quantify diversity, coherence, and factuality with simple metrics.

Notebook: lab02-sampling.ipynb  ·  Dataset: —
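One simple diversity metric that could serve here is distinct-$n$: the fraction of $n$-grams across a set of generations that are unique. The sketch below is an assumption about what "simple metrics" might look like, not the lab's prescribed method. It tokenizes on whitespace, and the example strings are invented rather than real model outputs.

```python
def distinct_n(texts, n=2):
    """Fraction of n-grams across all texts that are unique (higher = more diverse)."""
    total = 0
    unique = set()
    for text in texts:
        tokens = text.split()  # whitespace tokenization, an assumption
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(grams)
        unique.update(grams)
    return len(unique) / total if total else 0.0


# Invented generations: identical outputs versus varied outputs.
repetitive = ["the cat sat on the mat", "the cat sat on the mat"]
varied = ["the cat sat on the mat", "a dog ran through the park"]

print(distinct_n(repetitive, n=2))
print(distinct_n(varied, n=2))
```

A strategy that returns the same text on every run scores low, while one that produces varied text scores closer to 1. Coherence and factuality need separate measures that this sketch does not attempt.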


Reference text for this week: chapter 03 of the bilingual notes — EN PDF · FR PDF.