Clustering Malware's Network Behavior using Simple Sequential Features

Master Thesis (2018)
Author(s)

A. Nadeem (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

S.E. Verwer – Mentor

C. Hernandez Ganan – Mentor

Zaid Al-Ars – Coach

P.H. Hartel – Coach

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2018 Azqa Nadeem
Publication Year
2018
Language
English
Graduation Date
10-09-2018
Awarding Institution
Delft University of Technology
Programme
Computer Science | Cyber Security
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Developing malware variants is extremely cheap for attackers because obfuscation tools are widely available. These variants can be grouped into malware families based on information obtained from static and dynamic analysis. Dynamic, network-level analysis reveals the core behavior of malware, since it captures the interaction with its developer. Increasingly, Deep Packet Inspection (DPI) is used to cluster malware’s network behavior. However, DPI has severe privacy implications, as it involves inspecting the payloads of network traffic.

This thesis presents an exploratory study that aims to characterize and cluster malware behavior using high-level, non-privacy-invasive, sequential features extracted from its network activity. The key intuition behind the proposed solution is that if the underlying infrastructure of distinct malware samples is similar, the order in which they perform certain actions should also be similar. The results show that sequence clustering yields more flexible and robust clusters than clustering on non-sequential features. The clusters themselves reveal interesting attack capabilities, such as port scans, as well as the same Command and Control server responding to different malware families. Lastly, a comparison with clusters obtained from static analysis reveals that network-based clustering is far better suited to identifying the many behaviors exhibited by a single malware family, as well as behaviors common across multiple malware families.
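The intuition that similar infrastructure produces similarly ordered actions can be illustrated with a minimal sketch. The abstract does not name the features, distance measure, or clustering algorithm used in the thesis, so everything below is an assumption made for illustration: each sample is reduced to a hypothetical ordered list of destination ports, sequences are compared with a plain edit distance, and samples are grouped by a simple threshold-based single linkage. The sample names, port values, and the threshold of 0.5 are all invented.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two sequences (stand-in distance measure)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_distance(a, b):
    """Scale the edit distance to [0, 1] by the longer sequence length."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def cluster_sequences(sequences, threshold=0.5):
    """Group samples whose sequences lie within `threshold` of each other.

    Samples are linked pairwise and the connected components are returned,
    which amounts to single-linkage clustering cut at `threshold`.
    """
    names = list(sequences)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent pointers to the component root, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if normalized_distance(sequences[a], sequences[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical data: destination ports contacted by each sample, in order.
samples = {
    "sample_a": [53, 443, 443, 80, 443],
    "sample_b": [53, 443, 443, 80, 80],
    "sample_c": [22, 23, 25, 80, 110],  # scan-like sweep over ports
    "sample_d": [21, 22, 23, 25, 80],   # same sweep, shifted by one step
}

clusters = cluster_sequences(samples)
print(clusters)
```

Because only the order of high-level events is compared, no packet payload is inspected. In this toy data the two scan-like samples end up together even though their sequences are shifted by one step, which a position-by-position comparison would miss.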

Files

AN_thesis_anon.pdf
(pdf | 5.93 MB)
License info not available