Week 06 — Retrieval-Augmented Generation
Don't make the model memorize everything; let it look things up. RAG is the dominant production pattern for grounded LLM applications.
Lecture
The RAG architecture (Lewis et al. 2020) · chunking strategies · embedding models (dense, sparse, hybrid) · vector databases (FAISS, Chroma, Qdrant, Pinecone) · reranking · prompt assembly · evaluation (faithfulness, context relevance, answer correctness).
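To make the lecture topics concrete, here is a minimal sketch of the chunk, embed, retrieve, and assemble steps using only the Python standard library. It is not the lab's implementation. The term-count "embedding" is a toy sparse stand-in for a real embedding model, the in-memory list stands in for a vector database such as Chroma or FAISS, and the function names (`chunk_text`, `embed`, `retrieve`, `build_prompt`) and default parameters are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=40, overlap=10):
    """Split text into windows of `size` words that overlap by `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def embed(text):
    """Toy sparse embedding (assumption): lowercase term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (brute-force search)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, contexts):
    """Assemble a grounded prompt with numbered context passages."""
    numbered = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(contexts))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{numbered}\n\nQuestion: {question}\nAnswer:"
    )


if __name__ == "__main__":
    corpus = [
        "Malaria is transmitted by Anopheles mosquitoes.",
        "FAISS is a library for vector similarity search.",
        "Chunk overlap preserves context across boundaries.",
    ]
    question = "Which library does vector search?"
    print(build_prompt(question, retrieve(question, corpus, k=1)))
```

In a real pipeline, `embed` would call a dense or hybrid embedding model, `retrieve` would query the vector store, and a reranker would reorder the top candidates before `build_prompt` runs.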
Read before the lecture
Code lab
Lab 4 — RAG over a domain corpus
Index a set of public domain documents (WHO/AFRO reports, scientific papers, course PDFs) in Chroma. Build retrieval + reranking + generation. Evaluate on a 30-question held-out set.
Notebook: lab04-rag.ipynb · Dataset: Curated public PDFs (provided).
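As a rough idea of what scoring a held-out question set can look like, the sketch below computes two simple proxies: retrieval hit rate at k (a crude stand-in for context relevance) and token-level F1 against a reference answer (a crude stand-in for answer correctness). These are assumptions for illustration, not the lab's official metrics. Faithfulness is not covered here, since it usually needs a judge model or human review. The function names and data layout are hypothetical.

```python
import re
from collections import Counter


def tokens(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def token_f1(prediction, reference):
    """Token-overlap F1 between a predicted answer and a reference answer."""
    pred, ref = tokens(prediction), tokens(reference)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def hit_rate_at_k(retrieved_ids, gold_ids, k=3):
    """Fraction of questions whose gold chunk id is among the top-k retrieved ids."""
    if not gold_ids:
        return 0.0
    hits = sum(1 for got, gold in zip(retrieved_ids, gold_ids) if gold in got[:k])
    return hits / len(gold_ids)


if __name__ == "__main__":
    # Hypothetical two-question example; the lab's held-out set has 30 questions.
    retrieved = [["c7", "c2", "c9"], ["c1", "c4", "c5"]]
    gold = ["c2", "c8"]
    print("hit rate@3:", hit_rate_at_k(retrieved, gold, k=3))
    print("token F1:", token_f1("Anopheles mosquitoes", "Anopheles"))
```

Averaging `token_f1` over all questions gives a single answer-quality number, and reporting it alongside the hit rate helps separate retrieval failures from generation failures.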
Reference text for this week: chapter 06 of the bilingual notes — EN PDF · FR PDF.