Evaluating Machine Learning Models in Housing Price Forecasting

Authors

  • Kailin Wang

DOI:

https://doi.org/10.62051/ev013e77

Keywords:

Housing Price Prediction; Machine Learning; Feature Selection.

Abstract

This study compares the performance of several machine learning algorithms in predicting housing prices and builds an experimental workflow consisting of data preprocessing, feature selection, and comparative analysis of multiple regression models. First, a Random Forest (RF) is used to select the ten most important variables. These variables are then used to build a Multivariable Linear Regression (MLR) model, and lasso, best subset selection, and 5-fold cross-validation are applied to choose the final model. The study also trains and tests K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Bagging, Random Forest, Boosting, and Bayesian Additive Regression Trees (BART) models. Each model is evaluated by its mean squared error (MSE) on the training set and the test set, and these errors are compared to assess the strengths and weaknesses of each model in housing price prediction. The goal is to find the modeling method and variable combination that best suit the data set.
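The comparison step described above can be illustrated with a minimal sketch. It is not the author's code: it uses only the Python standard library, synthetic data with a single hypothetical predictor (`area`), and two simplified stand-in models (one-variable least squares and a basic KNN regressor) in place of the full set of models in the study. All function names are assumptions made for illustration. It shows how 5-fold cross-validation yields an average held-out MSE per model, which can then be compared.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def fit_linear(xs, ys):
    """Ordinary least squares with one predictor; returns a predict function."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_knn(xs, ys, k=5):
    """KNN regression: average the targets of the k closest training points."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


def k_fold_mse(xs, ys, fit, k=5, seed=0):
    """Average held-out MSE over k folds of a shuffled index list."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all rows
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        preds = [model(xs[i]) for i in fold]
        scores.append(mse([ys[i] for i in fold], preds))
    return sum(scores) / k


# Synthetic data (assumption): price depends linearly on area plus noise.
rng = random.Random(42)
area = [rng.uniform(50, 250) for _ in range(200)]
price = [1000 + 30 * a + rng.gauss(0, 300) for a in area]

cv_scores = {
    "linear": k_fold_mse(area, price, fit_linear),
    "knn": k_fold_mse(area, price, fit_knn),
}
best = min(cv_scores, key=cv_scores.get)  # model with the lowest CV MSE
print(cv_scores, best)
```

In practice the same loop would be run over each candidate model and variable subset, with training-set MSE reported alongside the held-out MSE to reveal overfitting.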

Published

10-07-2025

How to Cite

Wang, K. (2025) “Evaluating Machine Learning Models in Housing Price Forecasting”, Transactions on Computer Science and Intelligent Systems Research, 9, pp. 368–373. doi:10.62051/ev013e77.