MalPaCA: Malware behaviour analysis using unsupervised machine learning
Comparative analysis of various clustering algorithms on determining the best performance in terms of network behaviour discovery
H.J. de Heer (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Azqa Nadeem – Mentor (TU Delft - Cyber Security)
Sicco Verwer – Graduation committee member (TU Delft - Cyber Security)
M.A. Migut – Coach (TU Delft - Computer Science & Engineering-Teaching Team)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
MalPaCA uses unsupervised machine learning to assess malware capabilities by clustering the temporal behaviour found in malware network packet traces. This work presents a comparative analysis of clustering algorithms to determine which performs best at network behaviour discovery. The algorithms compared were HDBSCAN, OPTICS, Agglomerative Hierarchical Clustering and K-medoids. Their performance was measured with metrics that capture cluster separation, cohesion, purity and completeness. Agglomerative Hierarchical Clustering had the lowest total error in the comparison, 0.950, against 1.381 for the baseline HDBSCAN.
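As an illustration of the kind of metrics named in the abstract, the sketch below scores two candidate clusterings of a toy data set on purity, completeness and a silhouette score (which combines cohesion and separation). It is a minimal sketch under stated assumptions: the definitions are common textbook ones and may differ from those used in the thesis, the names `purity`, `completeness`, `mean_silhouette`, `truth`, `good` and `bad` are hypothetical, the toy points are invented for illustration, and the way the thesis combines metrics into a total error is not reproduced here.

```python
from collections import Counter, defaultdict


def purity(true_labels, cluster_labels):
    """Fraction of items that belong to the majority class of their cluster."""
    by_cluster = defaultdict(Counter)
    for t, c in zip(true_labels, cluster_labels):
        by_cluster[c][t] += 1
    majority = sum(max(counts.values()) for counts in by_cluster.values())
    return majority / len(true_labels)


def completeness(true_labels, cluster_labels):
    """Fraction of items that sit in the dominant cluster of their class.

    Assumption: this is 'inverse purity', one of several possible
    definitions of completeness.
    """
    return purity(cluster_labels, true_labels)


def mean_silhouette(dist, cluster_labels):
    """Mean silhouette over a precomputed distance matrix.

    Requires at least two clusters. Singleton clusters score 0 by convention.
    """
    members = defaultdict(list)
    for i, c in enumerate(cluster_labels):
        members[c].append(i)
    scores = []
    for i, c in enumerate(cluster_labels):
        own = [j for j in members[c] if j != i]
        if not own:
            scores.append(0.0)
            continue
        # a: cohesion (mean distance to the item's own cluster)
        a = sum(dist[i][j] for j in own) / len(own)
        # b: separation (mean distance to the nearest other cluster)
        b = min(
            sum(dist[i][j] for j in idx) / len(idx)
            for other, idx in members.items()
            if other != c
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data (invented): four points on a line, two true behaviour classes.
points = [0, 1, 10, 11]
dist = [[abs(p - q) for q in points] for p in points]
truth = ["a", "a", "b", "b"]
good = [0, 0, 1, 1]  # clustering that matches the classes
bad = [0, 1, 1, 1]   # clustering that splits class "a"

for name, labels in (("good", good), ("bad", bad)):
    print(name, purity(truth, labels), completeness(truth, labels),
          round(mean_silhouette(dist, labels), 3))
```

Because `mean_silhouette` takes a precomputed distance matrix, the same scoring could in principle be applied to any pairwise distance between packet sequences, whichever algorithm produced the labels.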