SmarTax: An Intelligent Tax Policy Maker via Multi-agent Reinforcement Learning and Irrationality of Humans

Tianlang Xiong

doi:10.62051/m1e5qs94

Keywords:

Irrationality Model; Markov Decision Process; State; Action; Reward Function; Transitional Function; Deep Q-Learning.

Abstract

Effective tax policies are difficult to design because taxpayers’ behavior often deviates from traditional economic models, which assume rational decision-making. Tax policymakers need tools that predict how individuals will respond to different tax policies while accounting for irrational behavior caused by biases, emotions, and imperfect information. This study introduces SmarTax, the first income tax policy-making simulator that integrates economic models, multi-agent reinforcement learning (MARL), and, more importantly, different models of human irrationality. The goal is to maximize a society’s GDP and social equity. Compared to traditional tax policy simulators, SmarTax narrows the gap between simulation and real-world scenarios by modeling the irrationality of the population instead of assuming universal rationality. For example, by modeling various levels of rationality, SmarTax can predict how a traditional reinforcement learning (RL)-based tax policy-making process might become invalid because low-income households underutilize tax credits owing to complexity or a lack of information. SmarTax was evaluated with three RL algorithms: Independent Proximal Policy Optimization (IPPO), Multi-Agent Deep Deterministic Policy Gradient (MADDPG), and Bi-level Mean Field Actor-Critic (BMFAC). The results indicate, first, that the inclusion of irrational factors impedes the effectiveness of all three RL algorithms. Second, compared to IPPO and MADDPG, BMFAC performs better when the irrationality level is low (<= 0.1): it can still find a tax policy at irrationality levels up to 0.1. However, it is less robust to higher irrationality levels, and its performance drops significantly once the irrationality level exceeds 0.1. Hence, SmarTax helps policymakers design transparent, understandable, and inclusive policies, aiming to achieve fairness and economic sustainability in society.
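
The abstract does not state how an irrationality level enters the simulation. As a minimal sketch, assume the level is read as the probability that a household ignores the action its learned policy recommends and picks one uniformly at random instead. This reading, the function names, and the four action labels are all hypothetical illustrations, not SmarTax's actual irrationality models.

```python
import random


def act_with_irrationality(rational_action, action_space, irrationality, rng):
    """Return the policy's action, or a uniformly random one.

    ASSUMPTION: `irrationality` is interpreted as the probability of
    replacing the rational action; the paper may define it differently.
    """
    if not 0.0 <= irrationality <= 1.0:
        raise ValueError("irrationality must lie in [0, 1]")
    if rng.random() < irrationality:
        # The random pick may coincide with the rational action.
        return rng.choice(action_space)
    return rational_action


def deviation_rate(irrationality, trials=10_000, seed=0):
    """Estimate how often the executed action differs from the rational one."""
    rng = random.Random(seed)
    # Hypothetical household actions, for illustration only.
    actions = ["work_more", "work_less", "save", "consume"]
    rational = "save"
    deviations = sum(
        act_with_irrationality(rational, actions, irrationality, rng) != rational
        for _ in range(trials)
    )
    return deviations / trials


if __name__ == "__main__":
    for level in (0.0, 0.1, 0.5):
        print(level, deviation_rate(level))
```

Under this assumption, a level of 0.1 means roughly one decision in ten is drawn at random, which gives a sense of how much noise a learned tax policy has to tolerate at the threshold reported above.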
