MalPaCA: Malware behaviour analysis using unsupervised machine learning

Comparative analysis of various clustering algorithms on determining the best performance in terms of network behaviour discovery

Bachelor Thesis (2021)
Author(s)

H.J. de Heer (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

A. Nadeem – Mentor (TU Delft - Cyber Security)

Sicco Verwer – Graduation committee member (TU Delft - Cyber Security)

M.A. Migut – Coach (TU Delft - Computer Science & Engineering-Teaching Team)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2021 Hugo de Heer
Publication Year
2021
Language
English
Graduation Date
01-07-2021
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

MalPaCA uses unsupervised machine learning to assess malware capabilities by clustering the temporal behaviour found in malware network packet traces. This thesis compares several clustering algorithms to determine which performs best at discovering network behaviour. The algorithms compared were HDBSCAN, OPTICS, Agglomerative Hierarchical Clustering and K-medoids. Performance was measured with metrics that capture cluster separation, cohesion, purity and completeness. Agglomerative Hierarchical Clustering achieved the lowest total error in the comparison, 0.950, against 1.381 for the baseline, HDBSCAN.
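
As a rough illustration of how purity and completeness can rank competing clusterings, the sketch below scores two toy label assignments in plain Python. The metric definitions, the `total_error` aggregation (a sum of 1 minus each score), the function names and the toy labels are all assumptions made for illustration; the thesis may define its metrics and total error differently, and separation and cohesion are left out because they require a distance matrix.

```python
from collections import Counter, defaultdict


def purity(cluster_labels, true_labels):
    """Fraction of points that share the majority true label of their cluster."""
    members = defaultdict(list)
    for cluster, truth in zip(cluster_labels, true_labels):
        members[cluster].append(truth)
    majority_total = sum(
        Counter(truths).most_common(1)[0][1] for truths in members.values()
    )
    return majority_total / len(cluster_labels)


def completeness(cluster_labels, true_labels):
    """Fraction of points that sit in the cluster holding most of their true class."""
    # Same computation as purity with the roles of clusters and classes swapped.
    return purity(true_labels, cluster_labels)


def total_error(cluster_labels, true_labels):
    """Hypothetical aggregate (not the thesis formula): lower is better."""
    return (1 - purity(cluster_labels, true_labels)) + (
        1 - completeness(cluster_labels, true_labels)
    )


# Toy data: six connections with made-up behaviour labels.
truth = ["scan", "scan", "scan", "c2", "c2", "c2"]
clustering_a = [0, 0, 0, 1, 1, 1]  # agrees with the labels exactly
clustering_b = [0, 0, 1, 1, 2, 2]  # splits and mixes the two behaviours

for name, labels in [("a", clustering_a), ("b", clustering_b)]:
    print(name, round(total_error(labels, truth), 3))
```

In this toy case the clustering that matches the labels scores an error of 0, while the one that splits and mixes behaviours scores higher, which is the kind of ordering a total-error comparison relies on.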
