A Systematic Survey of Multi-Agent Reinforcement Learning

Authors

  • Junyi Leng

DOI:

https://doi.org/10.62051/dvnhyg89

Keywords:

Multi-agent reinforcement learning; collaborative policy; game theory; credit assignment; non-stationarity.

Abstract

Multi-Agent Reinforcement Learning (MARL) addresses cooperation and competition problems in complex, dynamic environments through distributed decision-making, and in recent years has made significant progress in areas such as autonomous driving and robot swarm control. This paper systematically reviews the theoretical framework of MARL, its mainstream methods (e.g., MADDPG, QMIX), commonly used benchmarks (SMAC, Pommerman), and evaluation criteria (win rate, convergence speed), and analyzes the core challenges facing existing methods, such as non-stationarity and credit assignment. Experiments show that hybrid methods reach win rates above 85% on StarCraft II, but communication efficiency and scalability still need to be improved. The paper proposes combining graph neural networks with meta-learning as a direction for future research.

Published

10-07-2025

How to Cite

Leng, J. (2025) “A Systematic Survey of Multi-Agent Reinforcement Learning”, Transactions on Computer Science and Intelligent Systems Research, 9, pp. 120–125. doi:10.62051/dvnhyg89.