Week 09 — Multi-Agent Reinforcement Learning
When more than one agent learns at once: cooperation, competition, and the non-stationarity that every other learning agent introduces.
Lecture
Markov games as the generalization of MDPs · cooperative vs competitive vs general-sum settings · independent learning vs CTDE (centralized training, decentralized execution) · QMIX and MADDPG · multi-agent emergence.
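For orientation, the first two lecture topics can be written out in standard notation. This is a sketch using the usual textbook symbols, which are an assumption here and may differ from the notation used in the lecture or the notes.

```latex
% An n-agent Markov game (stochastic game) is a tuple
\[
  \mathcal{G} = \bigl(\mathcal{S},\ \{\mathcal{A}_i\}_{i=1}^{n},\ P,\ \{R_i\}_{i=1}^{n},\ \gamma\bigr),
\]
% with a transition kernel and per-agent rewards that depend on the joint action:
\[
  P(s' \mid s, a_1, \dots, a_n), \qquad R_i(s, a_1, \dots, a_n), \quad i = 1, \dots, n.
\]
% Setting n = 1 recovers an ordinary MDP. The three settings differ only in the rewards:
\[
  \text{cooperative: } R_1 = \dots = R_n, \qquad
  \text{two-player zero-sum: } R_1 = -R_2, \qquad
  \text{general-sum: no constraint.}
\]
```

From the point of view of agent $i$, the other agents' policies are folded into the transition and reward functions, so when those policies change during learning the environment that agent $i$ faces is no longer stationary.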
Read before the lecture
Problem set
PS5 — Multi-agent fundamentals
- Show that independent Q-learning agents in a simple 2-player Markov game can fail to find a Nash equilibrium.
- Implement the Iterated Prisoner’s Dilemma with two Q-learning agents. What policies emerge?
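As a possible starting point for the second exercise, here is a minimal sketch of two independent tabular Q-learners playing the Iterated Prisoner's Dilemma. Everything specific in it is an assumption rather than part of the problem set: the payoff values (3, 0, 5, 1), the choice of state (the previous joint action, seen from each agent's own side), the hyperparameters, and the names `QAgent` and `run_ipd`.

```python
import random

# Assumed payoff matrix: (reward to player 0, reward to player 1).
# Action 0 = cooperate, action 1 = defect.
PAYOFF = {(0, 0): (3, 3), (0, 1): (0, 5), (1, 0): (5, 0), (1, 1): (1, 1)}


class QAgent:
    """Tabular epsilon-greedy Q-learner that treats the other agent as part of the environment."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, rng=None):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = rng or random.Random()
        self.q = {}  # state -> [Q(cooperate), Q(defect)]

    def values(self, state):
        return self.q.setdefault(state, [0.0, 0.0])

    def act(self, state):
        q = self.values(state)
        # Explore with probability epsilon; also break exact ties at random.
        if self.rng.random() < self.epsilon or q[0] == q[1]:
            return self.rng.randrange(2)
        return 0 if q[0] > q[1] else 1

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        target = reward + self.gamma * max(self.values(next_state))
        q = self.values(state)
        q[action] += self.alpha * (target - q[action])


def run_ipd(rounds=20000, seed=0):
    agents = [QAgent(rng=random.Random(seed)), QAgent(rng=random.Random(seed + 1))]
    states = [None, None]  # no history before the first round
    history = []
    for _ in range(rounds):
        a0 = agents[0].act(states[0])
        a1 = agents[1].act(states[1])
        r0, r1 = PAYOFF[(a0, a1)]
        # Each agent observes (own last action, opponent's last action).
        next_states = [(a0, a1), (a1, a0)]
        agents[0].update(states[0], a0, r0, next_states[0])
        agents[1].update(states[1], a1, r1, next_states[1])
        states = next_states
        history.append((a0, a1))
    return agents, history


if __name__ == "__main__":
    agents, history = run_ipd()
    tail = history[-1000:]
    print("joint cooperation rate, last 1000 rounds:",
          sum(1 for a in tail if a == (0, 0)) / len(tail))
```

The outcome is sensitive to the seed, the exploration rate, and the state representation, so it is worth varying these before drawing conclusions about which policies emerge. The same loop with a different `PAYOFF` table can also serve as a testbed for the first exercise.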
Reference text for this week: chapter 09 of the bilingual notes — EN PDF · FR PDF.