Cluster Analysis of Restaurant Evaluation Features Based on UGC Data
DOI: https://doi.org/10.62051/ijcsit.v4n3.20
Keywords: Data Digging, User Generated Content (UGC), Word to Vector (word2vec), Clustering Techniques
Abstract
With the advent of the big data era, analysis of User Generated Content (UGC) data has become a popular research topic. Most current research uses UGC data either to analyze user preferences or to analyze the influence of UGC on other users. This paper instead takes the merchant's perspective: it uses UGC data to identify the advantageous features of merchants that users recognize, in order to help merchants improve their brand image. The original UGC data is first preprocessed with Chinese word segmentation, de-duplication and other techniques to generate a review corpus. A Word to Vector (word2vec) model is then trained on this corpus to obtain word vectors, the words are clustered into word clusters according to their vectors, and a number of representative words are picked from each cluster to build a thesaurus of restaurant evaluation features. In total, 34 restaurant evaluation keywords are extracted and labelled as 14 categories of restaurant evaluation features.
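The clustering and representative-word steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm or the selection rule, so plain k-means with Euclidean distance and "closest to the cluster centroid" are assumptions made here for illustration. The function names `kmeans` and `representatives`, the English words, and the two-dimensional vectors are hypothetical stand-ins for trained word2vec vectors of segmented Chinese review words.

```python
import math

def kmeans(vectors, k, iters=20):
    """Plain k-means. The first k vectors are used as initial centroids,
    which keeps this toy example deterministic (an assumption, not the paper's setup)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each word vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its member vectors.
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def representatives(words, vectors, labels, centroids, top_n=1):
    """Pick the top_n words closest to each cluster centroid."""
    result = {}
    for c, centroid in enumerate(centroids):
        members = [
            (math.dist(v, centroid), w)
            for w, v, l in zip(words, vectors, labels)
            if l == c
        ]
        result[c] = [w for _, w in sorted(members)[:top_n]]
    return result

# Hypothetical toy vectors standing in for trained word2vec output.
words = ["taste", "service", "flavor", "waiter", "delicious", "attitude"]
vectors = [
    (0.9, 0.1),  # taste
    (0.2, 0.8),  # service
    (0.8, 0.2),  # flavor
    (0.1, 0.9),  # waiter
    (1.0, 0.0),  # delicious
    (0.0, 1.0),  # attitude
]

labels, centroids = kmeans(vectors, k=2)
reps = representatives(words, vectors, labels, centroids, top_n=1)
print(labels)
print(reps)
```

In this toy data the food-related words and the service-related words fall into separate clusters, and the word nearest each centroid serves as that cluster's candidate keyword. Assigning a category name to each cluster, as the paper does for its 14 categories, remains a manual labelling step that the sketch does not cover.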
License
Copyright (c) 2024 International Journal of Computer Science and Information Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







