MalPaCA Feature Engineering
A comparative analysis between automated feature engineering and manual feature engineering on network traffic
S. Park (TU Delft - Electrical Engineering, Mathematics and Computer Science)
A. Nadeem – Mentor (TU Delft - Cyber Security)
Sicco Verwer – Mentor (TU Delft - Cyber Security)
M.A. Migut – Graduation committee member (TU Delft - Computer Science & Engineering-Teaching Team)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Identifying novel malware and its behaviour enables security engineers to protect users and their networked devices from attackers. MalPaCA is an algorithm that helps to understand network traffic behaviour by clustering uni-directional network connections; the resulting clusters can then be analyzed further to determine which label suits a malicious connection. In MalPaCA, the features used for clustering are extracted from packet information and were chosen manually, based on how well the information generalizes and on research into common malware characteristics. Alternatively, the feature set can be extracted automatically with an autoencoder, with the aim of producing a richer representation of each packet in the network traffic. This work compares an autoencoder-generated feature set with the hand-crafted feature set and shows that the hand-crafted feature set represents malicious traffic with higher accuracy and yields more insightful explanations. The comparative experiment is run on the IoT-23 dataset, a network traffic capture from Avast’s AIC laboratory.
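To make the idea of a hand-crafted, per-packet feature set concrete, the following is a minimal sketch in Python. It is an illustration only, not the thesis implementation: the abstract does not enumerate the features, so the four used here (packet size, inter-arrival gap, source port, destination port), the `Packet` record, the `handcrafted_features` helper, and the 20-packet cut-off are all assumptions.

```python
from typing import Dict, List, NamedTuple


class Packet(NamedTuple):
    """Hypothetical minimal packet record; field names are illustrative."""
    timestamp: float  # capture time in seconds
    size: int         # packet length in bytes
    src_port: int
    dst_port: int


def handcrafted_features(packets: List[Packet], max_len: int = 20) -> Dict[str, list]:
    """Turn one uni-directional connection into one sequence per feature.

    Assumption: only the first `max_len` packets are kept, so every
    connection yields sequences of bounded length.
    """
    pkts = sorted(packets, key=lambda p: p.timestamp)[:max_len]
    # Inter-arrival gap in milliseconds; the first packet has no predecessor,
    # so its gap is defined as 0.0 here.
    gaps = [0.0 for _ in pkts[:1]] + [
        round((cur.timestamp - prev.timestamp) * 1000, 3)
        for prev, cur in zip(pkts, pkts[1:])
    ]
    return {
        "packet_size": [p.size for p in pkts],
        "interarrival_ms": gaps,
        "src_port": [p.src_port for p in pkts],
        "dst_port": [p.dst_port for p in pkts],
    }


# Made-up example connection with three packets from one host to port 23.
conn = [
    Packet(timestamp=0.0, size=60, src_port=51234, dst_port=23),
    Packet(timestamp=0.25, size=1500, src_port=51234, dst_port=23),
    Packet(timestamp=1.0, size=52, src_port=51234, dst_port=23),
]
features = handcrafted_features(conn)
```

Each sequence in `features` can be inspected directly (for example, a run of identical packet sizes at regular gaps), which is one plausible reason hand-crafted features lend themselves to explanation. An autoencoder would instead map the raw packet fields to a learned, lower-dimensional vector whose individual dimensions have no predefined meaning.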