MalPaCA: Malware behaviour analysis using unsupervised machine learning


Comparative analysis of various clustering algorithms on determining the best performance in terms of network behaviour discovery

Bachelor Thesis (2021)

Author(s)

H.J. de Heer (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

A. Nadeem – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

S.E. Verwer – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

M.A. Migut – Coach (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty

Electrical Engineering, Mathematics and Computer Science

Clustering, MalPaCA, Comparative analysis, HDBSCAN

To reference this document use

https://resolver.tudelft.nl/uuid:254db628-839c-4f99-b9be-91469453076e


Publication Year

2021

Language

English

Graduation Date

01-07-2021

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering



Collections

thesis

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

MalPaCA uses unsupervised machine learning to assess malware capabilities by clustering the temporal behaviour of malware network packet traces. A comparative analysis was performed to determine which clustering algorithm performs best at network behaviour discovery. The algorithms analysed were HDBSCAN, OPTICS, Agglomerative Hierarchical Clustering and K-medoids. Their performance was measured with metrics that capture cluster separation, cohesion, purity and completeness. Agglomerative Hierarchical Clustering achieved the lowest total error in the comparative analysis, 0.950, compared with 1.381 for the baseline, HDBSCAN.
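To make the evaluation idea concrete, the sketch below scores two candidate clusterings of a toy data set with one cohesion/separation metric (the mean silhouette coefficient) and one purity metric. It is an illustration only. The toy points, the labels `scan` and `beacon`, the names `candidate_a` and `candidate_b`, and the ranking rule are all assumptions; the thesis's actual metrics and its total-error formula are not given in this record and are not reproduced here.

```python
from collections import Counter


def purity(cluster_ids, true_labels):
    """Fraction of points that carry the majority label of their cluster."""
    by_cluster = {}
    for cid, label in zip(cluster_ids, true_labels):
        by_cluster.setdefault(cid, []).append(label)
    majority = sum(Counter(m).most_common(1)[0][1] for m in by_cluster.values())
    return majority / len(cluster_ids)


def mean_silhouette(dist, cluster_ids):
    """Mean silhouette coefficient computed from a precomputed distance matrix."""
    members = {}
    for i, cid in enumerate(cluster_ids):
        members.setdefault(cid, []).append(i)
    if len(members) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, cid in enumerate(cluster_ids):
        own = [j for j in members[cid] if j != i]
        if not own:
            continue  # a singleton cluster contributes 0 by convention
        a = sum(dist[i][j] for j in own) / len(own)  # cohesion
        b = min(  # separation: mean distance to the nearest other cluster
            sum(dist[i][j] for j in idx) / len(idx)
            for other, idx in members.items()
            if other != cid
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(cluster_ids)


# Hypothetical toy data: four one-dimensional points with assumed behaviour labels.
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(p - q) for q in points] for p in points]
labels = ["scan", "scan", "beacon", "beacon"]

# Two hypothetical clusterings, standing in for the output of different algorithms.
candidates = {
    "candidate_a": [0, 0, 1, 1],
    "candidate_b": [0, 1, 0, 1],
}

scores = {
    name: (mean_silhouette(dist, ids), purity(ids, labels))
    for name, ids in candidates.items()
}

# Assumed ranking rule: higher silhouette first, then higher purity.
best = max(scores, key=scores.get)
print(best, scores[best])
```

In a real comparison, the distance matrix would come from the pairwise distances between packet-trace sequences, and each candidate would be the label vector produced by one clustering algorithm on that same matrix.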

Files

CSE3000_Research_project_Resea... (pdf)

(pdf | 1.42 Mb)

License info not available