Week 09 — Multi-Agent Reinforcement Learning

When more than one agent learns at once: cooperation, competition, and the non-stationarity each agent faces because every other agent is learning too.

Week 09 of 12

Lecture

Markov games as the generalization of MDPs · cooperative, competitive, and general-sum settings · independent learning vs CTDE (centralized training, decentralized execution) · QMIX and MADDPG · emergent behavior in multi-agent systems.
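For orientation, one common way to write down a Markov game and the three settings listed above is sketched here. The notation is an assumption made for this sketch and may differ from the symbols used in the lecture or the notes.

```latex
% A Markov (stochastic) game with n agents; notation assumed for illustration.
\[
  \mathcal{G} = \bigl(\mathcal{N},\ \mathcal{S},\ \{\mathcal{A}_i\}_{i \in \mathcal{N}},\ P,\ \{r_i\}_{i \in \mathcal{N}},\ \gamma\bigr),
  \qquad \mathcal{N} = \{1, \dots, n\}
\]
% Transitions and rewards depend on the joint action a = (a_1, ..., a_n).
\[
  P(s' \mid s, a_1, \dots, a_n), \qquad
  r_i : \mathcal{S} \times \mathcal{A}_1 \times \dots \times \mathcal{A}_n \to \mathbb{R}
\]
% Each agent i maximizes its own discounted return under the joint policy.
\[
  J_i(\pi_1, \dots, \pi_n) = \mathbb{E}\Bigl[\sum_{t=0}^{\infty} \gamma^{t}\, r_i(s_t, a_{1,t}, \dots, a_{n,t})\Bigr]
\]
% Settings:
%   fully cooperative:            r_1 = r_2 = \dots = r_n
%   two-player zero-sum:          r_1 = -r_2
%   general-sum:                  no restriction on the r_i
% With n = 1 the tuple reduces to an ordinary MDP.
```

From agent i's point of view the other agents' policies are part of the environment, so when they change during learning, the effective transition and reward functions that agent i experiences change as well.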

Read before the lecture

Problem set

PS5 — Multi-agent fundamentals

  1. Show that independent Q-learning agents in a simple 2-player Markov game can fail to find a Nash equilibrium.
  2. Implement Iterated Prisoner’s Dilemma with two Q-learning agents. What policies emerge?
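As a possible starting point for problem 2, here is a minimal sketch of two independent tabular Q-learners playing the Iterated Prisoner's Dilemma. Everything specific in it is an assumption made for illustration and not part of the problem statement: the payoff values (5, 3, 1, 0), the choice of state (the previous round's joint action), the hyperparameters, and the names `PAYOFF`, `QAgent`, and `run_ipd`.

```python
import random

# One-round payoff: (my_action, their_action) -> my reward.
# Actions: 0 = cooperate, 1 = defect. The values 5/3/1/0 are the conventional
# textbook choice and are an assumption here; the problem set may use others.
PAYOFF = {(0, 0): 3, (0, 1): 0, (1, 0): 5, (1, 1): 1}


class QAgent:
    """Tabular epsilon-greedy Q-learner (hypothetical helper for this sketch)."""

    def __init__(self, alpha=0.1, gamma=0.95, epsilon=0.1, rng=None):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = rng or random.Random()
        self.q = {}  # state -> [Q(cooperate), Q(defect)]

    def values(self, state):
        # Unseen states start with both action values at zero.
        return self.q.setdefault(state, [0.0, 0.0])

    def act(self, state):
        q = self.values(state)
        # Explore with probability epsilon; also break exact ties at random.
        if self.rng.random() < self.epsilon or q[0] == q[1]:
            return self.rng.randrange(2)
        return 0 if q[0] > q[1] else 1

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        q = self.values(state)
        target = reward + self.gamma * max(self.values(next_state))
        q[action] += self.alpha * (target - q[action])


def run_ipd(rounds=50_000, seed=0):
    """Play the iterated game; return both agents and the action history."""
    rng = random.Random(seed)
    a, b = QAgent(rng=rng), QAgent(rng=rng)
    # Each agent's state is (own last action, opponent's last action).
    state_a = state_b = (0, 0)  # arbitrary initial state (assumption)
    history = []
    for _ in range(rounds):
        act_a, act_b = a.act(state_a), b.act(state_b)
        r_a, r_b = PAYOFF[(act_a, act_b)], PAYOFF[(act_b, act_a)]
        next_a, next_b = (act_a, act_b), (act_b, act_a)
        # Each agent updates independently, treating the other as environment.
        a.update(state_a, act_a, r_a, next_a)
        b.update(state_b, act_b, r_b, next_b)
        state_a, state_b = next_a, next_b
        history.append((act_a, act_b))
    return a, b, history


if __name__ == "__main__":
    _, _, history = run_ipd()
    tail = history[-1000:]
    mutual_coop = sum(1 for x, y in tail if x == 0 and y == 0) / len(tail)
    print(f"mutual cooperation rate over the last 1000 rounds: {mutual_coop:.2f}")
```

Which policies emerge is likely to depend on the state representation, the exploration rate, and the seed, so it is worth varying `epsilon`, `gamma`, and the memory length before drawing conclusions. Because each agent updates as if the other were a fixed part of the environment, the same setup can also serve as a test bed for the non-stationarity argument in problem 1.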

Reference text for this week: chapter 09 of the bilingual notes — EN PDF · FR PDF.