Word data prediction based on statistical method
DOI: https://doi.org/10.62051/92ed6d91
Keywords: ARIMA; Central-center log-ratio transformation; BP neural network; K-means clustering; Spearman correlation coefficient.
Abstract
The puzzle game Wordle has been wildly popular since its launch. Its simple and unique rules attracted a large number of users in a short period of time and generated a large amount of data, making Wordle prediction a hot topic. In this paper, we construct four models based on the properties of words to provide a solution for word prediction: an interpretive prediction model for the number of reported results, a correlation judgment model, a percentage prediction model, and a word difficulty classification model. To measure word difficulty comprehensively, we evaluate words in terms of commonality and structure, and define sub-indicators such as the harmony of word pronunciation (both American and British), the number of vowels, the number of repeated letters, and word frequency. We build a BP neural network model for multi-value prediction that, for a given future puzzle word, predicts the associated percentage distribution over the number of attempts (1, 2, 3, 4, 5, 6, X). We develop a K-means clustering model to classify word difficulty and find that words fall into two categories; each word is assigned to the category whose center it is closest to. We suggest that, in addition to players spreading the game on their own initiative and attracting traffic, newspapers should actively promote the game (for example through advertising), starting with warm-up publicity before the game goes live. According to the popularity trend curve fitted by the differential equation model, mass interest in an emerging game rises rapidly, peaks for a short time, and then slowly declines as the novelty wears off. Therefore, to maintain public enthusiasm for as long as possible, publicity should be stepped up after the peak.
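The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only two of the named sub-indicators (number of vowels and number of repeated letters), and the word list, the choice of initial centers, and the helper names `word_features`, `nearest`, and `kmeans` are assumptions made for illustration. The paper's actual features and any normalization are not given here.

```python
# Illustrative sketch only: two-cluster K-means on two structural word features.
# Feature choice, sample words and initial centers are assumptions, not the paper's setup.

VOWELS = set("aeiou")


def word_features(word):
    """Return (vowel_count, repeated_letter_count) for a word."""
    w = word.lower()
    vowels = sum(ch in VOWELS for ch in w)
    repeated = len(w) - len(set(w))  # letters beyond their first occurrence
    return (float(vowels), float(repeated))


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))


def kmeans(points, centers, iters=20):
    """Plain Lloyd iterations; returns final centers and a label per point."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                # New center is the coordinate-wise mean of the cluster members.
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centers.append(old)  # keep an empty cluster's center unchanged
        if new_centers == centers:
            break
        centers = new_centers
    return centers, [nearest(p, centers) for p in points]


# Hypothetical sample words; the first two points seed the two centers.
words = ["crane", "mummy", "audio", "fluff", "stone", "eerie"]
points = [word_features(w) for w in words]
centers, labels = kmeans(points, centers=[points[0], points[1]])
```

Each word receives the label of the category whose center is nearest, which is the assignment rule the abstract states. With real data, features on different scales would normally be standardized before clustering.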
License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.