Enhancing Stock Market Prediction with Sentiment Analysis Using a BERT-based Model
DOI:
https://doi.org/10.62051/tq61jb84

Keywords:
Sentiment Analysis; Stock Market Prediction; BERT-Transformer Model; Financial Market

Abstract
This study explores the prediction of stock price trends in the financial market, emphasizing the impact of investor sentiment and macroeconomic policies. Traditional research often uses mathematical, statistical, or deep learning methods to predict stock prices but overlooks the emotional factors embedded in vast amounts of unstructured text data, such as financial news. This paper proposes a Bidirectional Encoder Representations from Transformers (BERT)-Transformer model that integrates sentiment analysis to enhance stock market prediction. Using financial news text from the Oriental Fortune Network, the BERT model performs sentiment analysis to extract emotional polarity features. These features, combined with stock transaction data, are then fed into a Transformer model to predict index trends. The experimental results show 60% accuracy in predicting the rise and fall of the stock index, indicating the model's effectiveness while highlighting room for improvement. The methodology details dataset preprocessing, the BERT-Back Propagation (BP) model structure, and how sentiment classification is combined with stock trading data for prediction. Performance comparisons with other prediction models confirm the benefits of incorporating sentiment analysis. The BERT-Transformer model's long-term memory capability makes it a promising tool for financial time series prediction, offering new research ideas and methods for the financial field. Future research will aim to automate the sentiment feature labeling process to save time and improve efficiency.
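For illustration only, the sketch below shows one way the two stages described above could be wired together in PyTorch with HuggingFace Transformers: a pre-trained Chinese BERT classifier turns each day's headlines into sentiment-probability features, which are concatenated with daily trading features and passed through a small Transformer encoder that classifies the index's next move as a rise or fall. The checkpoint name bert-base-chinese, the 20-day window, and all layer sizes are assumptions made for this sketch rather than the paper's reported configuration, and the sentiment head would need fine-tuning on labeled financial news before its outputs are meaningful.

```python
# Minimal sketch of the two-stage pipeline, assuming daily alignment of news and
# prices. Model name, window length, and layer sizes are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# --- Stage 1: sentiment polarity features from financial news headlines ---
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
bert = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-chinese", num_labels=3  # negative / neutral / positive
)  # classification head is freshly initialized; fine-tuning on labeled news assumed

def daily_sentiment(headlines: list[str]) -> torch.Tensor:
    """Average class probabilities over one day's headlines -> 3-dim feature vector."""
    inputs = tokenizer(headlines, padding=True, truncation=True,
                       max_length=128, return_tensors="pt")
    with torch.no_grad():
        probs = torch.softmax(bert(**inputs).logits, dim=-1)
    return probs.mean(dim=0)  # shape: (3,)

# --- Stage 2: Transformer over the joint sentiment + trading-feature sequence ---
class IndexTrendTransformer(nn.Module):
    """Binary rise/fall classifier over a window of daily feature vectors."""
    def __init__(self, n_features: int, d_model: int = 64, n_layers: int = 2):
        super().__init__()
        self.proj = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, 2)  # rise vs. fall
        # positional encoding omitted for brevity

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, window_len, n_features), where each day's vector concatenates
        # trading features (e.g. open/high/low/close/volume) and the 3 sentiment probs
        h = self.encoder(self.proj(x))
        return self.head(h[:, -1, :])  # classify from the last day in the window

# Example: 20-day window, 5 trading features + 3 sentiment features per day
model = IndexTrendTransformer(n_features=8)
logits = model(torch.randn(4, 20, 8))  # (batch=4, 2) rise/fall logits
```

Classifying from the encoded representation of the most recent day in the window mirrors the common practice of predicting the next trading day's direction from the latest encoded state.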
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.