Week 03 — Version Control — Git and DVC

Code is the easy part. Data and models version differently.

Lecture

Git for ML projects (branches, hooks, submodules) · DVC for data and model versioning · Git LFS, Pachyderm, lakeFS · MLflow Model Registry · the GitOps workflow.
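The data-versioning tools listed above share one idea: Git tracks a small text pointer, while the large file itself lives in separate storage keyed by its content hash. The block below is a minimal standard-library sketch of that idea, not the actual on-disk format of DVC or Git LFS. The names file_digest and track, the .ptr.json suffix, and the cache layout are all hypothetical and chosen only for illustration.

```python
import hashlib
import json
import shutil
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def track(data_file: Path, cache_dir: Path) -> Path:
    """Copy data_file into a content-addressed cache and write a pointer file.

    The pointer is small and text-based, so it is the part that would be
    committed to Git; the cached copy would live outside the repository.
    """
    digest = file_digest(data_file)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / digest
    if not cached.exists():  # identical content is stored only once
        shutil.copy2(data_file, cached)
    pointer = data_file.with_name(data_file.name + ".ptr.json")  # hypothetical suffix
    pointer.write_text(
        json.dumps(
            {
                "sha256": digest,
                "size": data_file.stat().st_size,
                "path": data_file.name,
            },
            indent=2,
        )
    )
    return pointer
```

Because the pointer records a hash rather than a file name or timestamp, checking out an old commit identifies exactly which bytes the code was trained on, and two commits that share a dataset share one cached copy.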

Read before the lecture

Code lab

Lab 2 — Versioning the full ML project

Take an existing ML notebook. Version the code in Git, the dataset in DVC, and the trained model artifact in the MLflow Model Registry. Tag a v1.0 release that can be reproduced from scratch, starting from a clean clone.
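One possible ordering of the Git and DVC steps in this lab can be sketched as a short helper that only builds and prints the commands, without running them. The dataset path data/train.csv and the function name release_commands are hypothetical, and the sketch assumes an empty project directory with both git and dvc installed. It is not the official lab solution.

```python
import shlex
from pathlib import Path


def release_commands(dataset: str, tag: str = "v1.0") -> list[list[str]]:
    """Build the git/dvc commands for tracking one dataset and tagging a release."""
    data_path = Path(dataset)
    # `dvc add` writes a pointer file next to the data and a .gitignore
    # in the same directory that hides the raw data from Git.
    pointer = data_path.with_name(data_path.name + ".dvc")
    ignore = data_path.parent / ".gitignore"
    return [
        ["git", "init"],
        ["dvc", "init"],
        ["dvc", "add", data_path.as_posix()],
        ["git", "add", pointer.as_posix(), ignore.as_posix()],
        ["git", "commit", "-m", f"Track {data_path.name} with DVC"],
        ["git", "tag", "-a", tag, "-m", f"Release {tag}"],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for command in release_commands("data/train.csv"):
        print(shlex.join(command))
```

The notebook and any training scripts would be committed with ordinary git add and git commit before tagging. Registering the trained model is a separate step, for example through mlflow.register_model with a model URI and a registry name of your choosing. To test the "reproduces from scratch" requirement, clone the tagged commit into a fresh directory and fetch the data with dvc pull, which assumes a DVC remote has been configured and pushed to.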

Notebook: lab02-versioning.ipynb  ·  Dataset: Any prior coursework dataset.


Reference text for this week: chapter 03 of the bilingual notes — EN PDF · FR PDF.