Week 05 — Fine-Tuning LLMs
Adapting a pretrained model to a domain or a task. LoRA, QLoRA, and the parameter-efficient fine-tuning revolution.
Lecture
Full fine-tuning · LoRA (Hu et al. 2021) and QLoRA (Dettmers et al. 2023) · adapters and prefix-tuning · instruction tuning · RLHF (Christiano et al. 2017 → Ouyang et al. 2022) · DPO (Rafailov et al. 2023) · the dataset engineering behind high-quality fine-tunes.
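To make the "parameter-efficient" claim concrete, here is a minimal sketch in plain Python of the LoRA idea: the pretrained weight W stays frozen and only a low-rank update B·A is trained, scaled by alpha / r. The matrix shape (4096 × 4096), the rank r = 8, and the helper names are illustrative assumptions, not values or code from the lecture.

```python
# Sketch of LoRA on a single weight matrix, standard library only.
# Shapes, rank, and alpha are illustrative assumptions.

def full_params(d_out: int, d_in: int) -> int:
    """Full fine-tuning updates every entry of the d_out x d_in matrix W."""
    return d_out * d_in

def lora_params(d_out: int, d_in: int, r: int) -> int:
    """LoRA trains B (d_out x r) and A (r x d_in) while W stays frozen."""
    return r * (d_out + d_in)

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha: float, r: int):
    """Compute W x + (alpha / r) * B (A x), the LoRA-adapted layer output."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + (alpha / r) * d for b, d in zip(base, delta)]

# Parameter count for one hypothetical 4096 x 4096 projection at rank 8.
d_out, d_in, r = 4096, 4096, 8
full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, r)
ratio = lora / full
print(f"full: {full:,}  LoRA (r={r}): {lora:,}  ratio: {ratio:.4%}")

# Tiny forward pass: B starts at zero, so the adapted layer initially
# reproduces the frozen layer exactly.
W = [[1.0, 2.0], [3.0, 4.0]]
A = [[0.5, -0.5]]        # r = 1, shape (1 x 2)
B = [[0.0], [0.0]]       # shape (2 x 1), zero-initialised
x = [1.0, 1.0]
y_init = lora_forward(W, A, B, x, alpha=2.0, r=1)
print(y_init)
```

Under these assumed shapes the adapter trains well under 1% of the parameters that full fine-tuning would touch for the same matrix, which is the trade-off the lecture examines.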
Read before the lecture
Code lab
Lab 3 — LoRA fine-tune a small open model
Fine-tune Mistral-7B (or Llama-3.1-8B) with LoRA on a domain dataset. Compare the base model's zero-shot performance with the fine-tuned model's performance on a 50-example task-specific eval.
Notebook: lab03-lora.ipynb · Dataset: Choose from a pre-curated set (medical Q&A, African-language instructions, code Q&A).
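The zero-shot vs fine-tuned comparison can be sketched as a small evaluation harness. This is a hedged illustration only: the exact-match metric, the `normalize` helper, and the two toy predictors are assumptions standing in for whatever metric and model calls the notebook actually uses, and the 50 question/answer pairs are synthetic placeholders.

```python
# Sketch of a 50-example eval comparing two predictors with exact-match accuracy.
# The metric, helpers, and toy predictors are hypothetical stand-ins.
from typing import Callable, Sequence, Tuple

Example = Tuple[str, str]  # (question, reference answer)

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count as errors."""
    return " ".join(text.lower().split())

def exact_match_accuracy(predict: Callable[[str], str],
                         examples: Sequence[Example]) -> float:
    """Fraction of examples whose normalized prediction equals the normalized reference."""
    hits = sum(normalize(predict(q)) == normalize(a) for q, a in examples)
    return hits / len(examples)

# Synthetic placeholder eval set of 50 items.
eval_set = [(f"q{i}", f"a{i}") for i in range(50)]

def zero_shot(question: str) -> str:
    """Toy stand-in for the base model: right on every fifth question."""
    i = int(question[1:])
    return f"a{i}" if i % 5 == 0 else "unknown"

def fine_tuned(question: str) -> str:
    """Toy stand-in for the LoRA model: right on every second question."""
    i = int(question[1:])
    return f"A{i} " if i % 2 == 0 else "unknown"

base_acc = exact_match_accuracy(zero_shot, eval_set)
tuned_acc = exact_match_accuracy(fine_tuned, eval_set)
print(f"zero-shot: {base_acc:.2f}  fine-tuned: {tuned_acc:.2f}  "
      f"delta: {tuned_acc - base_acc:+.2f}")
```

Running both models on the same fixed eval set, with the same normalization, keeps the comparison fair. With only 50 examples a single answer moves accuracy by 2 points, so small differences should be read cautiously.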
Reference text for this week: chapter 05 of the bilingual notes — EN PDF · FR PDF.