GenAI · Week 05 of 10

Week 05 — Fine-Tuning LLMs

Adapting a pretrained model to a domain or a task. LoRA, QLoRA, and the parameter-efficient fine-tuning revolution.

Lecture

Full fine-tuning · LoRA (Hu et al. 2021) and QLoRA (Dettmers et al. 2023) · adapters and prefix-tuning · instruction tuning · RLHF (Christiano et al. 2017 → Ouyang et al. 2022) · DPO (Rafailov et al. 2023) · the dataset engineering behind quality fine-tunes.
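
To make the "parameter-efficient" part concrete before the lecture, here is a minimal sketch of the LoRA idea in plain Python: the pretrained weight W stays frozen and only a low-rank pair A, B is trained, so the layer computes h = W x + (alpha / r) · B (A x). The helper names (lora_param_counts, matvec, lora_forward) are hypothetical, and the 4096 × 4096 shape, rank 8, and alpha values are illustrative assumptions, not settings prescribed by the course or the lab notebook.

```python
# Illustrative LoRA arithmetic in plain Python (no ML libraries required).
# All function names and the chosen sizes are assumptions for this sketch.

def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in            # full fine-tuning updates every entry of W
    lora = rank * (d_in + d_out)   # LoRA trains A (rank x d_in) and B (d_out x rank)
    return full, lora


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, rank):
    """Compute h = W x + (alpha / rank) * B (A x), with W treated as frozen."""
    base = matvec(W, x)                # output of the pretrained layer
    update = matvec(B, matvec(A, x))   # low-rank correction, via the rank-sized bottleneck
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


# Assumed example: one 4096 x 4096 projection adapted at rank 8.
full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(full, lora, f"{100 * lora / full:.2f}%")  # prints: 16777216 65536 0.39%
```

In the LoRA paper, B is initialised to zero, so the adapted layer starts out identical to the pretrained one; calling lora_forward with an all-zero B returns exactly matvec(W, x).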

Read before the lecture

Hu et al., *LoRA: Low-Rank Adaptation of Large Language Models* (ICLR 2022)

Code lab

Lab 3 — LoRA fine-tune a small open model

Fine-tune Mistral-7B (or Llama-3.1-8B) with LoRA on a domain dataset. Compare zero-shot vs fine-tuned performance on a 50-example task-specific eval.

Notebook: lab03-lora.ipynb · Dataset: Choose from a pre-curated set (medical Q&A, African-language instructions, code Q&A).
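
The zero-shot vs fine-tuned comparison can be scored with a small harness like the sketch below. It assumes normalized exact match as the metric, which may not be what lab03-lora.ipynb uses, and the helper names (normalize, exact_match_accuracy, compare_runs) and the toy strings are hypothetical placeholders, not real model outputs.

```python
# Hypothetical scoring helper for comparing two sets of model outputs.
# Exact match is an assumed metric; swap in whatever the lab specifies.

def normalize(text):
    """Lowercase and collapse whitespace so formatting noise is not scored as an error."""
    return " ".join(text.lower().split())


def exact_match_accuracy(predictions, references):
    """Fraction of predictions that equal their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("the eval set is empty")
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


def compare_runs(zero_shot, fine_tuned, references):
    """Score both runs against the same references and report the difference."""
    base = exact_match_accuracy(zero_shot, references)
    tuned = exact_match_accuracy(fine_tuned, references)
    return {"zero_shot": base, "fine_tuned": tuned, "delta": tuned - base}


# Toy placeholder data, made up purely to show the call shape.
references = ["Paris", "4", "blue"]
zero_shot_outputs = ["paris", "5", "red"]
fine_tuned_outputs = [" Paris ", "4", "red"]
print(compare_runs(zero_shot_outputs, fine_tuned_outputs, references))
```

With a 50-example eval, each example is worth 2 percentage points, so a gap of one or two examples between the runs is within noise; it helps to score both runs on the same examples with the same prompt template.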

Reference text for this week: chapter 05 of the bilingual notes — EN PDF · FR PDF.