Clustering Malware's Network Behavior using Simple Sequential Features

Nadeem, A.

Master thesis (2018)

Authors

A. Nadeem (Electrical Engineering, Mathematics and Computer Science)

Contributors

SE Verwer (mentor)

Carlos Hernandez Ganan (mentor)

Zaid Al-Ars (coach)

P.H. Hartel (coach)

Faculty

Electrical Engineering, Mathematics and Computer Science

Sequence Clustering, Network Analysis, Malware families

To reference this document use:

http://resolver.tudelft.nl/uuid:c8a221b9-9289-4978-a356-af64d8f2c5e0

Published Date

10-09-2018

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Developing malware variants is extremely cheap for attackers because obfuscation tools are widely available. These variants can be grouped into malware families based on information obtained from static and dynamic analysis. Dynamic, network-level analysis reveals a malware sample's core behavior because it captures the interaction with its developer. Increasing emphasis, however, is being placed on Deep Packet Inspection (DPI) to cluster malware’s network behavior, even though DPI has severe privacy implications: it involves inspecting the payloads of network traffic.

This thesis presents an exploratory study that aims to characterize and cluster malware behavior using high-level, non-privacy-invasive, sequential features extracted from its network activity. The key intuition behind the proposed solution is that if the underlying infrastructure of distinct malware samples is similar, the order in which they perform certain actions should also be similar. The results show that sequence clustering yields flexible and robust clusters, unlike clustering on non-sequential features. The clusters themselves reveal interesting attack capabilities, such as port scans, as well as cases in which the same Command and Control server responds to different malware families. Lastly, a comparison with clusters obtained from static analysis reveals that network-based clustering is far better suited to identifying the many behaviors exhibited by a single malware family, as well as behaviors common across multiple malware families.
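The intuition that the order of actions matters can be illustrated with a small sketch. The example below is not taken from the thesis: it assumes packet sizes as the sequential feature and dynamic time warping (DTW) as the sequence distance, and the connection data and the name `dtw_distance` are hypothetical. It only shows how a sequence-aware distance can separate connections that an order-insensitive summary would treat as identical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Illustrative choice of distance; the abstract does not name the
    distance measure used in the thesis.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] aligned with an already-used b element
                cost[i][j - 1],      # b[j-1] aligned with an already-used a element
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Hypothetical packet-size sequences (bytes) for three connections.
conn_a = [60, 60, 1500, 1500, 40]
conn_b = [60, 1500, 1500, 1500, 40]  # same order of actions, different repetition
conn_c = [1500, 40, 60, 1500, 60]    # same sizes as conn_a, different order

print(dtw_distance(conn_a, conn_b))
print(dtw_distance(conn_a, conn_c))
```

Here `conn_a` and `conn_c` contain exactly the same packet sizes, so any feature that ignores order (a histogram or a mean, for instance) cannot tell them apart, while the sequence distance can. A pairwise distance matrix built this way could then be passed to any clustering algorithm that accepts precomputed distances.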

Files

AN_thesis_anon.pdf

(pdf | 5.93 MB)

License info not available