Week 10 — Model Monitoring in Production

A deployed model does not self-regulate. It drifts, finds shortcuts, gets exposed to populations the training data never saw.

Lecture

Performance metrics in production (latency, throughput, error rate) · data drift detection (Kolmogorov-Smirnov, Wasserstein, Population Stability Index) · concept drift · alerting (Grafana, Prometheus, PagerDuty) · the Amazon hiring-model failure as a case study · fairness monitoring (Fairlearn, Evidently).
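
As a warm-up for the drift-detection topics, here is a minimal sketch of two of the statistics named above, the Population Stability Index and the two-sample Kolmogorov-Smirnov statistic, written with the standard library only. The function names `psi` and `ks_statistic`, the quantile binning, and the `eps` floor are illustrative assumptions, not the lecture's reference implementation.

```python
import math
from bisect import bisect_right


def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two 1-D numeric samples.

    Assumption: bin edges are quantiles of the reference sample.
    Equal-width bins are another common choice.
    """
    ref_sorted = sorted(reference)
    # n_bins - 1 interior edges taken at reference quantiles.
    edges = [ref_sorted[int(len(ref_sorted) * k / n_bins)]
             for k in range(1, n_bins)]

    def proportions(sample):
        counts = [0] * n_bins
        for x in sample:
            counts[bisect_right(edges, x)] += 1
        # Floor empty bins at eps so the logarithm stays finite.
        return [max(c / len(sample), eps) for c in counts]

    p = proportions(reference)
    q = proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the two empirical CDFs, checked at every observed value."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

In use, `reference` would be a feature column from the training set and `current` the same feature from a recent window of production traffic. A widely quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but that threshold is a convention and should be tuned per feature.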

Read before the lecture

Recitation — paper discussion

Dastin, *Amazon scraps secret AI recruiting tool that showed bias against women* (Reuters, 2018)

Come ready to argue one side of each:

  • What would a monitoring system that caught this have looked like?
  • Is monitoring sufficient, or is the problem upstream in training data?
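
One possible starting point for the first question, offered as a sketch and not as an answer: a monitor that tracks the selection rate per group and reports the gap between groups. The function names `selection_rates` and `parity_gap` are hypothetical, and the sketch assumes a group label is logged alongside every decision, which is itself a design choice worth debating.

```python
from collections import defaultdict


def selection_rates(decisions):
    """Selection rate per group.

    `decisions` is an iterable of (group, selected) pairs, where
    `selected` is truthy when the model recommended the candidate.
    """
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, was_selected in decisions:
        totals[group] += 1
        selected[group] += int(bool(was_selected))
    return {g: selected[g] / totals[g] for g in totals}


def parity_gap(rates):
    """Gap between the highest and lowest selection rates.

    `difference` is max minus min; `ratio` is min divided by max.
    """
    lo, hi = min(rates.values()), max(rates.values())
    return {"difference": hi - lo, "ratio": lo / hi if hi > 0 else 1.0}
```

A ratio below 0.8 is a commonly cited warning level (the "four-fifths" guideline from US employment practice), though whether that threshold fits a given system is a judgment call. Fairlearn, listed in the lecture topics, provides comparable demographic parity metrics for real workloads.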

Reference text for this week: chapter 10 of the bilingual notes — EN PDF · FR PDF.