A Systematic Survey of Multi-Agent Reinforcement Learning
DOI:
https://doi.org/10.62051/dvnhyg89Keywords:
Multi-agent reinforcement learning; collaborative policy; game theory; credit assignment; non-stationarity.Abstract
Multi-Agent Reinforcement Learning (MARL) solves collaboration and competition problems in complex dynamic environments through distributed decision-making mechanisms, and has made significant progress in recent years in areas such as autonomous driving and robot cluster control. In this paper, we systematically sort out the theoretical framework, mainstream methods (e.g., MADDPG, QMIX), commonly used datasets (SMAC, Pommerman), and evaluation criteria (win rate, convergence speed) of MARL, and analyze the core challenges of the existing methods, such as non-smoothness, and credit allocation. Experiments show that the winning rate of the hybrid method on StarCraft II has reached more than 85%, but the communication efficiency and scalability still need to be improved. This paper proposes the improvement direction of combining graph neural networks and meta-learning for subsequent research.
Downloads
References
[1] Mikayel Samvelyan, Tabish Rashid, Christian Schroeder de Witt, et al. The Starcraft Multi-Agent Challenge. Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019, 2186-2188.
[2] Jiakun Wang, Yiran Zhang, Hao Tang. Quantifying Collaboration Diversity in Multi-Agent Systems via Entropy Metrics. Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems, 2021, 567-575.
[3] Ryan Lowe, Yi Wu, Aviv Tamar, et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments. Advances in Neural Information Processing Systems, 2017, 30, 6379-6390.
[4] Christian Schroeder de Witt, Tarun Gupta, Dmytro Makovichuk, et al. Is Independent Learning All You Need in the Starcraft Multi-Agent Challenge? Proceedings of the 34th Conference on Neural Information Processing Systems, 2020, 1-12.
[5] Tianyu Wang, Hongyao Dong, Kaiyuan Zhang, Jianye Wang. GraphComm: Dynamic Graph Communication for Efficient Multi-Agent Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 2024, 37(5), 12345-12353.
[6] Peter Sunehag, Guy Lever, Audrunas Gruslys, et al. Value-Decomposition Networks for Cooperative Multi-Agent Learning. Proceedings of the 35th International Conference on Machine Learning, 2018, 80, 4292-4301.
[7] Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning. Proceedings of the 35th International Conference on Machine Learning, 2018, 80, 4292-4301.
[8] Jakob Foerster, Richard Y. Chen, Maruan Al-Shedivat, et al. Learning with Opponent-Learning Awareness. Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018, 122-130.
[9] Johannes Heinrich, David Silver. Deep Reinforcement Learning from Self-Play in Imperfect-Information Games. Proceedings of the 33rd International Conference on Machine Learning, 2016, 48, 3040-3049.
[10] Lei Chen, Yuxuan Zhang, Qiang Liu. HAMLET: Hierarchical Multi-Agent Learning for Long-Term Tasks. Proceedings of the International Conference on Learning Representations, 2023.
[11] Chao Yu, Aravind Rajeswaran, Eugene Vinitsky, Jiaxuan Gao, Yi Wang, Alexandre Bayen. The Surprising Effectiveness of MAPPO in Cooperative Multi-Agent Games. Advances in Neural Information Processing Systems, 2022, 35, 24621-24634.
[12] Baidu Research. Apollo 7.0: Multi-Agent Collaborative Decision-Making for Autonomous Driving. IEEE Transactions on Intelligent Transportation Systems, 2023, 24(8), 1-15.
[13] Amazon Robotics. Kiva Systems: Multi-Agent Reinforcement Learning for Warehouse Automation. Internal Technical Report, 2022.
[14] Cinjon Resnick, Rujul Raileanu, Shagun Kapoor, Alex Peysakhovich, Kyunghyun Cho. Backplay: Man muss immer umkehren. NeurIPS Workshop on Deep Reinforcement Learning, 2018.
[15] OpenAI. Multi-Agent Simulation Environments for Robotic Control. Technical Report, 2020.
[16] Alexey Dosovitskiy, German Ros, Felipe Codevilla, et al. CARLA: An Open Urban Driving Simulator. Conference on Robot Learning, 2017, 78, 1-16.
[17] Karol Kurach, Anton Raichuk, Piotr Stanczyk, et al. Google Research Football: A Novel Reinforcement Learning Environment. Proceedings of the 34th AAAI Conference on Artificial Intelligence, 2020, 34(4), 4501-4510.
[18] Marc Lanctot, Edward Lockhart, Jean-Baptiste Lespiau, et al. OpenSpiel: A Framework for Reinforcement Learning in Games. arXiv preprint arXiv:1908.09453, 2019.
Downloads
Published
Conference Proceedings Volume
Section
License
Copyright (c) 2025 Transactions on Computer Science and Intelligent Systems Research

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







