Research on Sentiment Analysis Based on Ensemble Learning

Authors

  • Kaiyu Li
  • Yi Xie
  • Shicheng Li
  • Zhaomin Liu
  • Di Liu

DOI:

https://doi.org/10.62051/ijcsit.v4n3.17

Keywords:

Sentiment analysis, Ensemble learning, Random forest, Gradient boosting tree, Support vector machines

Abstract

This paper applies ensemble learning to sentiment analysis of tourist attraction reviews, with the aim of making those reviews more useful to tourists when they make travel decisions. As online reviews have become widespread, extracting emotional information from them efficiently and accurately has become an important problem. In this study, random forest, gradient boosting decision tree (GBDT), and support vector machine (SVM) classifiers serve as the base learners, and a weighted voting strategy combines them into an ensemble model. Parameters are optimized with cross-validation and grid search, and model performance is evaluated comprehensively with metrics such as AUC, Log Loss, MCC, and average accuracy. Experimental results show that the ensemble model achieves better classification accuracy and robustness than any single model, with an AUC of 0.92 and an MCC of 0.78. A performance surface plot over different decision thresholds illustrates the balance between precision and recall and is used to determine the optimal threshold range. The model can provide reliable sentiment insights for tourism platform users, help scenic spot managers identify user needs and optimize services, and can be extended to other review analysis fields.
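The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the synthetic data stands in for vectorized review features, and the voting weights, parameter grid, and threshold values are all assumptions chosen for the example, since the abstract does not report them.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.metrics import (log_loss, matthews_corrcoef, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Assumption: synthetic binary data stands in for vectorized review features.
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Weighted soft voting over the three base learners named in the abstract.
# The weights [2, 1, 1] are illustrative only.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("gbdt", GradientBoostingClassifier(n_estimators=50, random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
    ],
    voting="soft",
    weights=[2, 1, 1],
)

# Grid search with 3-fold cross-validation over a small, hypothetical grid.
param_grid = {"rf__max_depth": [4, 8], "svm__C": [0.5, 2.0]}
search = GridSearchCV(ensemble, param_grid, cv=3, scoring="roc_auc")
search.fit(X_train, y_train)

# Evaluate the refitted best ensemble with the metrics the abstract lists.
proba = search.predict_proba(X_test)[:, 1]
pred = (proba >= 0.5).astype(int)
metrics = {
    "auc": roc_auc_score(y_test, proba),
    "log_loss": log_loss(y_test, proba),
    "mcc": matthews_corrcoef(y_test, pred),
}

# Sweep decision thresholds to inspect the precision/recall trade-off.
thresholds = np.linspace(0.1, 0.9, 9)
sweep = []
for t in thresholds:
    pred_t = (proba >= t).astype(int)
    sweep.append((float(t),
                  precision_score(y_test, pred_t, zero_division=0),
                  recall_score(y_test, pred_t)))

if __name__ == "__main__":
    print(metrics)
    for t, p, r in sweep:
        print(f"threshold={t:.1f} precision={p:.3f} recall={r:.3f}")
```

Soft voting is used here so that the ensemble outputs probabilities, which AUC, Log Loss, and the threshold sweep all require; a hard-voting ensemble would only support label-based metrics such as MCC.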

Published

24-11-2024

Section

Articles

How to Cite

Li, K., Xie, Y., Li, S., Liu, Z., & Liu, D. (2024). Research on Sentiment Analysis Based on Ensemble Learning. International Journal of Computer Science and Information Technology, 4(3), 160-170. https://doi.org/10.62051/ijcsit.v4n3.17