Feature Selection and Regression for House Value Prediction

Authors

  • Zhaowen Gu

DOI:

https://doi.org/10.62051/cman5j03

Keywords:

Feature selection, Regression, House value prediction.

Abstract

This report investigates the application of feature selection and regression techniques, including LASSO, Ridge Regression, and Elastic Net, to predicting house values. By analyzing these methods, we aim to identify the most influential factors driving house prices and assess their relative importance. Our study comprehensively evaluates several linear regression models for effectiveness and consistency in predicting house prices. The results reveal that while all examined techniques perform robustly, certain methods, particularly Elastic Net, offer superior predictive accuracy and handle multicollinearity among features better. This report offers practical insights for real estate professionals and data scientists seeking to refine their predictive models and make informed, data-driven decisions.
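As a concrete illustration of the comparison the abstract describes, the sketch below fits the three penalized linear models side by side. It is a minimal example under stated assumptions, not the paper's actual pipeline: it assumes scikit-learn and substitutes the California housing dataset, since the paper's data, features, and hyperparameter grids are not given on this page.

```python
# A minimal sketch of the comparison described in the abstract, not the
# paper's actual pipeline: it assumes scikit-learn and uses the California
# housing data as a stand-in dataset (the paper's data are not given here).
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import ElasticNetCV, LassoCV, RidgeCV
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Cross-validated variants tune the regularization strength automatically;
# the grids below are illustrative choices, not values from the paper.
models = {
    "LASSO": LassoCV(cv=5),
    "Ridge": RidgeCV(alphas=[0.01, 0.1, 1.0, 10.0]),
    "Elastic Net": ElasticNetCV(cv=5, l1_ratio=[0.1, 0.5, 0.9]),
}

for name, model in models.items():
    # Standardize first so all coefficients are penalized on a common scale.
    pipe = make_pipeline(StandardScaler(), model)
    pipe.fit(X_train, y_train)
    n_kept = int((pipe[-1].coef_ != 0).sum())
    print(f"{name}: test R^2 = {pipe.score(X_test, y_test):.3f}, "
          f"nonzero coefficients: {n_kept}/{X.shape[1]}")
```

The nonzero-coefficient count makes the feature selection behavior visible: LASSO and Elastic Net can drive coefficients exactly to zero, while Ridge only shrinks them, which is why Elastic Net acts as a compromise that both selects features and tolerates correlated ones.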




Published

24-10-2024

How to Cite

Gu, Z. (2024) “Feature Selection and Regression for House Value Prediction”, Transactions on Computer Science and Intelligent Systems Research, 8, pp. 153–166. doi:10.62051/cman5j03.