NBA Player Comprehensive Score Prediction based on Linear Regression and Random Forest

Authors

  • Junliang Lyu
  • Ouze Wang
  • Yinghao Zhang

DOI:

https://doi.org/10.62051/vqn9s032

Keywords:

Machine learning; Random Forest; Linear regression.

Abstract

This paper applies machine learning to build a comprehensive prediction model that forecasts the scores of National Basketball Association (NBA) players. The objective is to support the analysis, guidance, and evaluation of players' value. Preprocessing consists of renaming the dataset, handling rows with missing values, and splitting the data into an 80% training set and a 20% test set. The data are then visualized and a grouped correlation analysis is carried out, producing a correlation heatmap used to identify and mitigate multicollinearity. From the visualizations, a tentative ability map of players at different positions is drawn up, covering rebounds, assists, and other aspects. Random forest and linear regression models are trained on the NBA player data, their performance is compared, and the strengths of each are analyzed. Histograms and line graphs are produced for both models; random forest fits the data better, indicating more accurate predictions than linear regression. Future work aims to apply a wider range of models and to use multiple evaluation methods for a more detailed assessment.
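The workflow described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the data are synthetic, the three feature columns and the target formula are hypothetical, and scikit-learn is assumed as the modeling library. The sketch drops rows with missing values, computes the feature correlation matrix that a heatmap would display, makes the 80/20 split, and scores both models with R² on the held-out set.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for per-player statistics (NOT the paper's dataset).
# Hypothetical columns: rebounds, assists, minutes (standardized).
n_players = 500
X = rng.normal(size=(n_players, 3))
# Hypothetical target with one non-linear term plus noise.
y = 2.0 * X[:, 0] + 1.5 * X[:, 1] + X[:, 2] ** 2
y = y + rng.normal(scale=0.1, size=n_players)

# Inject a few missing values, then drop the incomplete rows.
missing_rows = rng.choice(n_players, size=10, replace=False)
X[missing_rows, 1] = np.nan
complete = ~np.isnan(X).any(axis=1)
X, y = X[complete], y[complete]

# Pairwise feature correlations (the matrix a heatmap would show).
corr = np.corrcoef(X, rowvar=False)

# 80% training set, 20% test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model and record its R^2 on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

print(scores)
```

The squared term in the synthetic target is a deliberate choice: it gives the tree-based model something a purely linear fit cannot capture. How the two models compare on real player data depends on the dataset and features used.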

Published

12-08-2024

How to Cite

Lyu, J., Wang, O. and Zhang, Y. (2024) “NBA Player Comprehensive Score Prediction based on Linear Regression and Random Forest”, Transactions on Computer Science and Intelligent Systems Research, 5, pp. 562–567. doi:10.62051/vqn9s032.