Application of Five Machine Learning Techniques for Stellar Classification on SDSS DR18
DOI:
https://doi.org/10.62051/v7mpk191
Keywords:
Machine Learning; Stellar Classification; SDSS Dataset; Explainability.
Abstract
Systematic classification of celestial objects is a cornerstone of modern astronomy because it underpins our understanding of the universe’s large-scale structure and long-term evolution. It also facilitates the discovery of rare or extreme phenomena, offering clues to the physical laws that govern them. This study presents a comparative evaluation of five machine learning models (Random Forest, Adaptive Boosting, Extreme Gradient Boosting, Deep Neural Network, and TabNet) on the Sloan Digital Sky Survey Data Release 18 dataset for classifying celestial objects into GALAXY, STAR, and Quasi-Stellar Object (QSO). Although none of the models was specially tuned, all achieved accuracy above 96.8%, with Extreme Gradient Boosting reaching an accuracy of 98.5%. The classification decisions are explained through feature importance analysis with SHAP values, which reveals that redshift and the color index E(u−g) are the most influential attributes in distinguishing object classes. The study also identifies several confusing features that contribute to misclassification, suggesting a need for improved data quality in these dimensions. The research highlights the dual role of machine learning in astronomy: automating classification and supporting discovery.
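To illustrate what a SHAP-style attribution measures, the sketch below computes exact Shapley values for a small scoring function by enumerating every feature subset. It is an illustration only, not the paper's pipeline: the function `toy_score`, the feature names `redshift`, `u_g`, and `g_r`, the input values, and the all-zero baseline are hypothetical, and practical SHAP implementations use faster model-specific algorithms rather than brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at point x, relative to a single baseline point."""
    names = list(x)
    n = len(names)

    def value(subset):
        # Features in `subset` take the instance's value; the rest stay at baseline.
        point = {k: (x[k] if k in subset else baseline[k]) for k in names}
        return f(point)

    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


def toy_score(p):
    # Hypothetical score with one interaction term; NOT a model from the paper.
    return 2.0 * p["redshift"] + 0.5 * p["u_g"] + p["redshift"] * p["g_r"]


# Hypothetical object and baseline, chosen only for illustration.
x = {"redshift": 1.5, "u_g": 1.2, "g_r": 0.4}
baseline = {"redshift": 0.0, "u_g": 0.0, "g_r": 0.0}

phi = shapley_values(toy_score, x, baseline)
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {contribution:.3f}")
```

The attributions add up to the difference between the score at `x` and the score at the baseline, and the interaction term is split evenly between the two features involved. Ranking features by the magnitude of such attributions is the basis of the feature importance analysis described above.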
License
Copyright (c) 2025 Transactions on Computer Science and Intelligent Systems Research

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







