Optimizing Science Question Ranking through Model and Retrieval-Augmented Generation

Authors

  • Ye Zhang
  • Mengran Zhu
  • Yulu Gong
  • Rui Ding

DOI:

https://doi.org/10.62051/ijcsit.v1n1.17

Keywords:

Large language models, OpenBookQA, ranking, science-based questions, Platypus2-70B

Abstract

This paper examines the challenge of selecting the best answers to science-based questions generated by large language models (LLMs), with particular emphasis on the ranking task. Using the MAP@3 evaluation metric and the OpenBookQA dataset, the study compares modeling strategies and highlights the strong performance of the Platypus2-70B model. Paired with a state-of-the-art text encoder, Platypus2-70B achieves a score of 0.909904, setting a benchmark for future large language model competitions. Beyond describing model architectures and experimental results, the paper discusses the broader impact of large-scale language models on natural language understanding, particularly in scientific domains.
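The MAP@3 metric used in the abstract credits each question by the reciprocal rank of the correct answer among the model's top three ranked choices, averaged over all questions. A minimal sketch of the computation (function name and data layout are illustrative, not from the paper):

```python
def map_at_3(predictions, labels):
    """predictions: list of ranked answer lists (top 3 per question);
    labels: the correct answer for each question.
    A correct answer at rank k contributes 1/k; misses contribute 0."""
    total = 0.0
    for ranked, truth in zip(predictions, labels):
        for k, choice in enumerate(ranked[:3], start=1):
            if choice == truth:
                total += 1.0 / k
                break
    return total / len(labels)

# Correct at rank 1, rank 2, and a miss -> (1 + 0.5 + 0) / 3 = 0.5
preds = [["B", "A", "C"], ["C", "B", "D"], ["A", "D", "E"]]
gold = ["B", "B", "C"]
print(round(map_at_3(preds, gold), 4))
```

Under this metric, a model that always ranks the correct choice first scores 1.0, so the reported 0.909904 means the correct answer is almost always at or near the top of the ranking.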

References

Brown, P. F., Della Pietra, S. A., Della Pietra, V. J., & Mercer, R. L. (1993). The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19(2), 263-311. https://bit.ly/3RNCRSY.

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems, 30. https://bit.ly/3Sk8XUV.

Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. https://arxiv.org/abs/1810.04805.

Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., ... & Zettlemoyer, L. (2019). Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461. https://arxiv.org/abs/1910.13461.

Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training. https://www.mikecaptain.com/resources/pdf/GPT-1.pdf.

Wu, Y., Schuster, M., Chen, Z., Le, Q. V., Norouzi, M., Macherey, W., ... & Dean, J. (2016). Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144. https://arxiv.org/abs/1609.08144.

Karpukhin, V., Oğuz, B., Min, S., Lewis, P., Wu, L., Edunov, S., ... & Yih, W. T. (2020). Dense passage retrieval for open-domain question answering. arXiv preprint arXiv:2004.04906. https://arxiv.org/abs/2004.04906.

Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., ... & Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research, 21(1), 5485-5551. https://dl.acm.org/doi/abs/10.5555/3455716.3455856.

Le, Q., & Mikolov, T. (2014, June). Distributed representations of sentences and documents. In International conference on machine learning (pp. 1188-1196). PMLR. https://bit.ly/3UjkSoQ.

National Academies of Sciences, Engineering, and Medicine. (2019). Quantum computing: progress and prospects. https://bit.ly/3SzTilU.

De la Hoz, R., Villar, S., Castillo, K., Castellón, J., & Coronado, K. (2022). Factores clave para el éxito de ciudades inteligentes y sostenibles: una revisión sistemática de la literatura [Key factors for the success of smart and sustainable cities: A systematic literature review]. INVENTUM, 17(33), 44-54. https://revistas.uniminuto.edu/index.php/Inventum/article/view/3141.

Zhao, X., et al. (2024). Effective combination of 3D-DenseNet's artificial intelligence technology and gallbladder cancer diagnosis model. Frontiers in Computing and Intelligent Systems, 6(3), 81-84. https://doi.org/10.54097/iMKyFavE.

Li, S., et al. (2024). Application analysis of AI technology combined with spiral CT scanning in early lung cancer screening. Frontiers in Computing and Intelligent Systems, 6(3), 52-55. https://doi.org/10.54097/LAwfJzEA.

Published

30-12-2023

Issue

Section

Articles

How to Cite

Zhang, Y., Zhu, M., Gong, Y., & Ding, R. (2023). Optimizing Science Question Ranking through Model and Retrieval-Augmented Generation. International Journal of Computer Science and Information Technology, 1(1), 124-130. https://doi.org/10.62051/ijcsit.v1n1.17