Week 06 — Decision Trees

Breiman, Friedman, Olshen, Stone 1984: classification by sequential yes/no questions. The transparent classifier.

ML  ·  schedule  ·  Week 06 of 12 ·  ← 05 ·  07 →

Week 06 — Decision Trees

Breiman, Friedman, Olshen, Stone 1984: classification by sequential yes/no questions. The transparent classifier.

Lecture

Decision trees as space partitions · the splitting criteria (Gini, entropy, MSE) · greedy growth · cost-complexity pruning · tree stability and its consequences for the next chapter.

Read before the lecture

  • Hastie, Tibshirani, Friedman, chapter 9

Problem set

PS3 — Trees and their limits

  1. Prove that the empirical Gini impurity is bounded between 0 and $1 - 1/k$ for $k$ classes.
  2. Show that decision trees are unstable: small data perturbations can change the tree structure dramatically.
  3. Construct a dataset where a linear classifier beats a tree, and one where the reverse is true.

Reference text for this week: chapter 06 of the bilingual notes — EN PDF · FR PDF.