Week 06 — Decision Trees
Breiman, Friedman, Olshen, Stone 1984: classification by sequential yes/no questions. The transparent classifier.
Week 06 — Decision Trees
Breiman, Friedman, Olshen, Stone 1984: classification by sequential yes/no questions. The transparent classifier.
Lecture
Decision trees as space partitions · the splitting criteria (Gini, entropy, MSE) · greedy growth · cost-complexity pruning · tree stability and its consequences for the next chapter.
Read before the lecture
- Hastie, Tibshirani, Friedman, chapter 9
Problem set
PS3 — Trees and their limits
- Prove that the empirical Gini impurity is bounded between 0 and $1 - 1/k$ for $k$ classes.
- Show that decision trees are unstable: small data perturbations can change the tree structure dramatically.
- Construct a dataset where a linear classifier beats a tree, and one where the reverse is true.
Reference text for this week: chapter 06 of the bilingual notes — EN PDF · FR PDF.