Real-world Applications of Bandit Algorithms: Insights and Innovations

Authors

  • Qianqian Zhang

DOI:

https://doi.org/10.62051/ge4sk783

Keywords:

Multi-armed bandit, real-world applications, recommendation system.

Abstract

In the rapidly evolving landscape of decision-making systems, Multi-Armed Bandit (MAB) algorithms have surged in significance, showing a remarkable ability to address the exploration-exploitation dilemma across diverse domains. Rooted in probabilistic and statistical decision theory, MAB algorithms play a critical role by offering a systematic approach to making choices in uncertain environments with limited information. These algorithms balance the trade-off between exploiting known options for immediate gains and exploring new possibilities for future benefits. The spectrum of MAB algorithms ranges from Stochastic Stationary Bandits, which assume static reward distributions, to more complex variants such as Restless and Contextual Bandits, each tailored to the dynamism and specificity of real-world challenges. Structured Bandits further exploit underlying patterns in reward distributions, providing strategic insight into the decision-making process. The practical applications of these algorithms span several fields, including healthcare, content recommendation, and education, demonstrating their versatility and efficacy in addressing context-specific challenges. This paper provides a comprehensive overview of the development, nuances, and practical applications of MAB algorithms, highlighting their pivotal role in advancing decision-making under uncertainty.
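To make the exploration-exploitation trade-off described above concrete, below is a minimal illustrative sketch in Python of the classic UCB1 strategy on a stochastic stationary bandit. It is not code from the paper; the Bernoulli arm probabilities, horizon, and function name are assumptions chosen purely for demonstration.

    import math
    import random

    def ucb1(true_probs, horizon=10_000, seed=0):
        """Minimal UCB1 for a stochastic stationary Bernoulli bandit.

        Illustrative sketch only: arm i pays reward 1 with the (unknown)
        probability true_probs[i]. UCB1 balances exploitation (high
        empirical mean) against exploration (an uncertainty bonus that
        grows for rarely pulled arms).
        """
        rng = random.Random(seed)
        k = len(true_probs)
        pulls = [0] * k    # times each arm has been played
        means = [0.0] * k  # empirical mean reward per arm

        for t in range(1, horizon + 1):
            if t <= k:
                arm = t - 1  # play every arm once to initialize
            else:
                # Upper confidence bound: empirical mean plus exploration bonus.
                arm = max(range(k), key=lambda i: means[i]
                          + math.sqrt(2 * math.log(t) / pulls[i]))
            reward = 1.0 if rng.random() < true_probs[arm] else 0.0
            pulls[arm] += 1
            means[arm] += (reward - means[arm]) / pulls[arm]  # running mean

        return pulls, means

    # Illustrative probabilities: the best arm (0.7) accumulates most pulls,
    # while the logarithmic bonus keeps occasional exploration alive.
    pulls, means = ucb1([0.3, 0.5, 0.7])
    print(pulls, [round(m, 3) for m in means])

Under these assumed settings, the pull counts concentrate on the best arm while suboptimal arms are still sampled occasionally, which is exactly the balance between immediate gains and future benefits that the abstract describes.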




Published

12-08-2024

How to Cite

Zhang, Q. (2024) “Real-world Applications of Bandit Algorithms: Insights and Innovations”, Transactions on Computer Science and Intelligent Systems Research, 5, pp. 753–758. doi:10.62051/ge4sk783.