Machine Learning-based Voting Classifier for Improving Sentiment Analysis on Twitter Data
DOI:
https://doi.org/10.62051/nfkz3035
Keywords:
Machine learning; sentiment analysis; voting classifier.
Abstract
As more people share their opinions on Twitter, understanding the sentiment behind their tweets becomes increasingly important for researchers. To identify a model that accurately classifies tweet sentiment, the author uses a dataset published in 2022 that contains tweet texts annotated with their sentiments. Six basic machine learning classifiers are trained: Logistic Regression, Naïve Bayes Classifier, Support Vector Classifier, Decision Tree Classifier, Random Forest Classifier, and K-Nearest Neighbors Classifier. The trained models are then evaluated. Validation shows that Logistic Regression, the Support Vector Classifier, and the Random Forest Classifier achieve the highest accuracy and F1-score, with only small differences among the three. To improve on them, the author combines these three models into a voting classifier. The resulting model outperforms every basic model, reaching 71.6% on both accuracy and F1-score. The study also shows how each model differs from the best model in distinguishing positive, neutral, and negative tweets.
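The voting step described in the abstract can be illustrated with a minimal sketch. The abstract does not say whether hard or soft voting was used, so the example below assumes hard (majority) voting over the labels predicted by three already-trained models. The function name `majority_vote`, the tie rule, and the sample predictions are hypothetical and serve only as illustration; they are not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-model label predictions by hard (majority) voting.

    predictions[m][i] is the label that model m assigns to tweet i.
    Tie rule (an assumption of this sketch, not from the paper): when
    no single label has the most votes, use the first model's label.
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every model must predict the same number of samples")

    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes).most_common()
        top_label, top_count = counts[0]
        # A tie exists if the runner-up has as many votes as the leader.
        tied = len(counts) > 1 and counts[1][1] == top_count
        combined.append(votes[0] if tied else top_label)
    return combined


# Made-up predictions from three hypothetical trained models.
logreg_pred = ["positive", "neutral", "negative", "positive"]
svc_pred = ["positive", "negative", "negative", "neutral"]
forest_pred = ["neutral", "negative", "negative", "negative"]

ensemble_pred = majority_vote([logreg_pred, svc_pred, forest_pred])
print(ensemble_pred)
```

With three voters and three sentiment classes, all three models can disagree, so some tie rule is needed; the fallback to the first model is one simple choice. In practice, scikit-learn's `VotingClassifier` offers both hard and soft voting and could replace this hand-written function.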
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







