Machine Learning-based Voting Classifier for Improving Sentiment Analysis on Twitter Data

Authors

  • Huatao Li

DOI:

https://doi.org/10.62051/nfkz3035

Keywords:

Machine learning; sentiment analysis; voting classifier.

Abstract

As more people share their thoughts on Twitter, understanding the sentiment behind their tweets becomes increasingly important for researchers. To identify the model that most accurately classifies tweet sentiment, the author uses a dataset published in 2022 that contains tweet texts annotated with their sentiments. Six basic machine learning classification methods are trained: Logistic Regression, Naïve Bayes Classifier, Support Vector Classifier, Decision Tree Classifier, Random Forest Classifier, and K-Nearest Neighbors Classifier. The author then evaluates the trained models. In validation, Logistic Regression, the Support Vector Classifier, and the Random Forest Classifier achieve the highest accuracy and F1-score, and the differences among these three models are small. To improve on them, the author combines the three best models into a voting classifier. The resulting model outperforms every basic model in both accuracy and F1-score, each of which reaches 71.6%. The research also shows how each basic model and the best model differ when distinguishing between positive, neutral, and negative tweets.
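The voting step described in the abstract can be illustrated with a minimal sketch in scikit-learn. This is not the author's implementation: the abstract does not state the feature extraction, the hyperparameters, or whether voting is hard or soft, so the TF-IDF features, the linear kernel, hard (majority) voting, the weighted F1 average, and the toy tweets below are all assumptions chosen for illustration.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def build_voting_model(random_state=0):
    """Combine the three best base models named in the abstract.

    Assumptions (not from the source): TF-IDF features, a linear SVC
    kernel, 100 trees, and hard (majority) voting.
    """
    voter = VotingClassifier(
        estimators=[
            ("lr", LogisticRegression(max_iter=1000)),
            ("svc", SVC(kernel="linear")),
            ("rf", RandomForestClassifier(n_estimators=100,
                                          random_state=random_state)),
        ],
        voting="hard",  # each model casts one vote per tweet
    )
    return make_pipeline(TfidfVectorizer(), voter)


def evaluate(model, texts, labels):
    """Return (accuracy, weighted F1); the averaging mode is an assumption."""
    predicted = model.predict(texts)
    return (accuracy_score(labels, predicted),
            f1_score(labels, predicted, average="weighted"))


# Hypothetical toy tweets standing in for the annotated 2022 dataset.
texts = [
    "I love this phone, it is great",
    "what a wonderful and happy day",
    "this is the worst service ever",
    "I hate waiting in this terrible line",
    "the meeting is at noon today",
    "the package arrives on tuesday",
]
labels = ["positive", "positive", "negative", "negative",
          "neutral", "neutral"]

model = build_voting_model()
model.fit(texts, labels)
predictions = model.predict(["great happy day", "terrible worst line"])
accuracy, f1 = evaluate(model, texts, labels)
```

On real data the scores would be computed on a held-out split rather than on the training tweets; the toy evaluation here only shows how the two metrics reported in the abstract could be obtained.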

Published

12-08-2024

How to Cite

Li, H. (2024) “Machine Learning-based Voting Classifier for Improving Sentiment Analysis on Twitter Data”, Transactions on Computer Science and Intelligent Systems Research, 5, pp. 1–9. doi:10.62051/nfkz3035.