Authorship Attribution of Malware Binaries
Y.J.I. de Boer (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Sicco Verwer – Mentor (TU Delft - Cyber Security)
Reginald Lagendijk – Graduation committee member (TU Delft - Cyber Security)
Maurício Aniche – Graduation committee member (TU Delft - Software Engineering)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Attributing malware to the developers who wrote it is an important factor in cybercrime investigations. Clustering not only malware of the same family but also related malware across families provides more information about the authors and aids further malware analysis. This report questions previous work which concluded that compiled binaries can be attributed on the basis of a programmer's style. Given insight into this matter, the report explores new clustering techniques for both statically and dynamically derived features of malware binaries. The two methods are complementary, as they provide very different types of data. In the static analysis, the data for the similarity comparison is derived from disassembled binaries, while in the dynamic analysis the system calls made by the malware during execution are recorded.
We compare data at a finer granularity than complete binaries, so that fine-grained similarities among malware families, rather than differences, can be found. Evaluating the clusters is difficult because of the unsupervised nature of clustering and because of data quality issues. However, manual inspection of the generated clusters shows that the newly developed clustering methods confirm previously discovered similarities and also find new connections among malware families.
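As a rough illustration of what comparing at a finer granularity than whole binaries can look like, the sketch below splits system call traces into short overlapping n-grams and scores each pair of samples by Jaccard similarity. This is not the method of the report: the n-gram size, the Jaccard measure, the function names and the toy traces are all assumptions chosen for illustration.

```python
from itertools import combinations


def ngrams(trace, n=3):
    """Return the set of contiguous n-grams in a system call trace."""
    return {tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def pairwise_similarity(traces, n=3):
    """Score every pair of samples by the overlap of their n-gram sets."""
    grams = {name: ngrams(trace, n) for name, trace in traces.items()}
    return {
        (x, y): jaccard(grams[x], grams[y])
        for x, y in combinations(sorted(grams), 2)
    }


# Hypothetical traces: sample names and system calls are made up.
traces = {
    "sample_a": ["open", "read", "connect", "send", "close"],
    "sample_b": ["open", "read", "connect", "send", "unlink"],
    "sample_c": ["fork", "execve", "wait", "exit"],
}

sims = pairwise_similarity(traces)
for pair, score in sims.items():
    print(pair, round(score, 2))
```

In this toy input, sample_a and sample_b share two of their three 3-grams even though the full traces differ, which is the kind of partial overlap a whole-trace comparison would miss. A similarity table like this could then be fed to any clustering algorithm that accepts pairwise scores.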