Guided Malware Sample Analysis Based on Graph Neural Networks

Journal Article (2023)

Author(s)

Yi Hsien Chen (National Taiwan University)

Si Chen Lin (National Taiwan University)

Szu Chun Huang (National Yang Ming Chiao Tung University, Hsinchu; TU Delft - Technology, Policy and Management)

Chin Laung Lei (National Taiwan University)

Chun Ying Huang

Research Group

Organisation & Governance

Keywords

Reverse engineering, Graph neural network, Malware analysis, Machine learning for security

DOI related publication

https://doi.org/10.1109/TIFS.2023.3283913 Final published version

To reference this document use

https://resolver.tudelft.nl/uuid:ae1da5ba-1673-49fb-8993-4330eed009fb

Publication Year

2023

Language

English

Volume number

18

Pages (from-to)

4128-4143

Collections

Institutional Repository

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Malicious binaries cause data and monetary losses, and they continue to evolve rapidly. With large numbers of new, unknown attack binaries appearing, one essential daily task for security analysts and researchers is to identify the malicious parts of a binary and report its critical behaviors. Because manual analysis is slow and ineffective, automated malware report generation is a long-term goal for malware analysts and researchers. This study moves one step toward that goal by identifying essential functions in malicious binaries to accelerate, and even automate, the analysis process. We design and implement an expert system, called MalwareExpert, based on our proposed graph neural network. The system pinpoints the essential functions of an analyzed sample and visualizes the relationships between the involved parts. We evaluate our approach using executable binaries for the Windows operating system. The evaluation results show that our approach achieves competitive detection performance (97.3% accuracy and 96.5% recall) compared to existing malware detection models. Moreover, it gives an intuitive and easy-to-understand explanation of the model predictions by visualizing and correlating essential functions. We compare the essential functions identified by our system against several expert-made malware analysis reports from multiple sources. Our qualitative and quantitative analyses show that the pinpointed functions indicate accurate directions. In the best case, the top 2% of functions reported by the system can cover all expert-annotated functions in three steps. We believe that the MalwareExpert system sheds light on automated program behavior analysis.

Files

Guided_Malware_Sample_Analysis... (pdf)

(pdf | 3.29 MB)

- Embargo expired on 07-12-2023

License info not available