Classification of Malicious Document Detection Based on Artificial Intelligence

Xingyu Wang

doi:10.62051/ijcsit.v2n2.10

Keywords:

Cybersecurity, Document Detection, Artificial Intelligence, Machine Learning

Abstract

Malicious document attacks are among the most severe threats in cybersecurity today. This paper compares traditional and artificial intelligence (AI) methods for detecting malicious documents and proposes AI-DDNet, an application and technical framework for AI in malicious document detection. It first introduces the main methods of malicious document attack, including malicious code attacks, object embedding attacks, document vulnerability attacks, and remote link attacks. It then systematically reviews traditional static and dynamic analysis methods, as well as static and dynamic analysis techniques that integrate AI, including machine learning, deep learning, and reinforcement learning. Finally, it summarizes the challenges and unresolved issues of current technology and outlines future directions for AI in malicious document detection. The study aims to promote innovation and development in cybersecurity technology.
