Transformer-Based Video Comment Analysis

Authors

  • Ziyi Hua

DOI:

https://doi.org/10.62051/ijcsit.v4n1.15

Keywords:

Text summarization, Transformer model, mT5_m2o_chinese_simplified_crossSum, Bilibili, Trolley problem, Deep learning, NLP, Comment analysis

Abstract

In this work, we conduct a comprehensive analysis of sentiment in Bilibili comments using a Transformer-based model. We employ the mT5_m2o_chinese_simplified_crossSum model, a multilingual Transformer model fine-tuned for cross-lingual summarization into Simplified Chinese. The study involves scraping comments from a video on the "trolley problem," followed by data cleaning and preprocessing. The preprocessed comments are fed into the Transformer model, which generates summaries that accurately reflect the main viewpoints and discussions in the comments. Our results show that the model effectively captures key ethical dilemmas, personal opinions, and emotional responses, providing a nuanced understanding of public sentiment. This research highlights the potential of advanced NLP techniques for processing large volumes of user-generated content, offering businesses and researchers valuable insight into public opinion and a means of facilitating ethical discussion. The study underscores the importance of adapting Transformer models to specific tasks, demonstrating their flexibility and performance in text summarization. These findings provide a robust foundation for future applications and innovations in natural language processing.
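The abstract does not include an implementation, but the pipeline it describes (clean the scraped comments, then summarize them with the mT5_m2o_chinese_simplified_crossSum model) can be sketched with the Hugging Face transformers library. This is a minimal illustration, not the authors' code: the checkpoint path csebuetnlp/mT5_m2o_chinese_simplified_crossSum is the public Hub location of the model named above, while the sample comments, the cleaning function, and the decoding parameters are assumptions made for the example.

```python
import re
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed public Hub checkpoint of the model named in the abstract.
MODEL_NAME = "csebuetnlp/mT5_m2o_chinese_simplified_crossSum"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def clean(text: str) -> str:
    """Collapse newlines and repeated whitespace -- a stand-in for the
    paper's (unspecified) cleaning and preprocessing step."""
    return re.sub(r"\s+", " ", text).strip()

# Illustrative stand-ins for comments scraped from a Bilibili video
# about the trolley problem.
comments = [
    "如果必须选择，我会拉动拉杆，牺牲一个人救五个人。",
    "不作为也是一种选择，我无权决定谁生谁死。",
    "这个思想实验没有正确答案，每个人的道德直觉都不一样。",
]

# Concatenate the cleaned comments into a single input document.
document = clean(" ".join(comments))

# Tokenize, truncating to the model's input window.
input_ids = tokenizer(
    [document], return_tensors="pt", truncation=True, max_length=512
)["input_ids"]

# Beam-search decoding; these generation parameters are illustrative.
output_ids = model.generate(
    input_ids=input_ids,
    max_length=84,
    num_beams=4,
    no_repeat_ngram_size=2,
)[0]

summary = tokenizer.decode(output_ids, skip_special_tokens=True)
print(summary)  # A short Simplified Chinese summary of the main viewpoints.
```

In practice, comment threads longer than the 512-token input window would need to be chunked and summarized in batches; the beam-search settings above are a common decoding setup for this family of checkpoints rather than values reported in the paper.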

Published

13-09-2024

Issue

Vol. 4 No. 1 (2024)

Section

Articles

How to Cite

Hua, Z. (2024). Transformer-Based Video Comment Analysis. International Journal of Computer Science and Information Technology, 4(1), 121-126. https://doi.org/10.62051/ijcsit.v4n1.15