
Week 11 — Reproducibility in Research — Standards and Best Practices

Pineau and Henderson showed in 2017 that identical RL code, run with nothing changed but the random seed, can produce learning curves that differ by as much as 3×. Reproducibility is an engineering problem, not a matter of virtue.

Lecture

The reproducibility crisis (Nature 2016) · ML-specific reproducibility (Pineau and Henderson 2017) · the NeurIPS Reproducibility Checklist · Papers with Code · containerization, seeds, data versioning, dependency pinning · the production-ready research workflow.
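As a warm-up for the seeds and data-versioning topics, here is a minimal sketch using only the Python standard library. The names `seeded_run`, `sha256_of`, and `run_manifest` are hypothetical helpers invented for this illustration, and the "training run" is a toy random walk, not a real RL experiment. It only shows the habit: pass the seed explicitly, and record the seed, a hash of the data, and the interpreter version next to every result.

```python
import hashlib
import json
import platform
import random
import sys


def seeded_run(seed: int, steps: int = 5) -> list[float]:
    """Toy stand-in for a training run: returns a fake 'learning curve'."""
    rng = random.Random(seed)  # local generator, so no hidden global state
    score = 0.0
    curve = []
    for _ in range(steps):
        score += rng.uniform(-1.0, 1.0)  # pretend reward noise
        curve.append(round(score, 6))
    return curve


def sha256_of(data: bytes) -> str:
    """Content hash used as a cheap data-version identifier."""
    return hashlib.sha256(data).hexdigest()


def run_manifest(seed: int, dataset: bytes) -> dict:
    """Record what is needed to rerun the experiment (illustrative fields)."""
    return {
        "seed": seed,
        "dataset_sha256": sha256_of(dataset),
        "python": platform.python_version(),
        "platform": sys.platform,
    }


if __name__ == "__main__":
    dataset = b"hypothetical,training,data\n"
    for seed in (0, 1, 2):
        print(seed, seeded_run(seed))
    print(json.dumps(run_manifest(0, dataset), indent=2))
```

Running the same seed twice gives the same curve, while different seeds give visibly different ones. A real workflow would also seed every framework in use and pin dependencies, which this sketch does not attempt.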

Read before the lecture

Pineau et al., *Improving Reproducibility in Machine Learning Research* (JMLR 2021)

Recitation — paper discussion

Baker, *1,500 scientists lift the lid on reproducibility* (Nature 2016)

Come ready to argue one side of each:

Has ML reproducibility improved between 2016 and 2026?
What would a useful reproducibility benchmark look like?

Reference text for this week: chapter 11 of the bilingual notes — EN PDF · FR PDF.