Anomaly Detection in Network Traffic using Multivariate State Machines
V. Serentellos (TU Delft - Electrical Engineering, Mathematics and Computer Science)
SE Verwer – Mentor (TU Delft - Cyber Security)
R.L. Lagendijk – Graduation committee member (TU Delft - Cyber Security)
Annibale Panichella – Graduation committee member (TU Delft - Software Engineering)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Driven by the rapid development of Information and Communication Technologies (ICT), computer networks have assumed an increasingly important role in modern human activity. More and more individual users and businesses around the world are gaining access to online networks, and the services these networks offer span multiple domains of human life. As a result, networks keep growing in both size and complexity while handling a constantly growing volume of user-generated data. Like every important aspect of human life, computer networks need to be protected from malicious adversaries who aim to degrade the quality of the offered services or gain unauthorised access to them, so that their intended functionality is maintained. The broad success of Machine Learning (ML) techniques across a wide range of fields has led to their wide adoption in automated network traffic analysis systems that aim to detect malicious activity within computer networks, and a notable portion of these systems employ solutions inspired by the field of anomaly detection. This thesis presents such an automated system for anomaly detection in network traffic, which attempts to address as many of the major shortcomings of earlier related work as possible. In particular, the proposed system is designed to offer fine-grained analysis of the recorded traffic. It leverages powerful sequential learning models, such as multivariate state machines, that embed well-known anomaly detection algorithms in their structure. These models extract benign behavioral profiles from NetFlow traces of aggregated network entities, such as hosts or connections, and any behavior that does not conform to these profiles is identified as anomalous.
Three publicly available NetFlow-based datasets, incorporating a diverse set of cyber attacks, are used to evaluate the detection potential of the proposed methodology. First, the effectiveness of multiple settings of the designed detection system is quantified, so that the configurations with the most promising detection potential can be identified. Subsequently, the proposed system is compared with various easily developed baseline detection methodologies, so that the impact of its inherent complexity can be evaluated. Finally, the designed system is compared to a state-of-the-art detection technique operating on one of the three datasets used in this thesis, and achieves higher or similar detection performance in all the scenarios considered.
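The profiling idea outlined in the abstract (aggregate flows per entity, learn benign sequential behavior, flag what does not conform) can be illustrated with a deliberately simplified sketch. It is not the multivariate state machine model of the thesis: it substitutes a first-order transition model with Laplace smoothing, and the flow fields (`src`, `proto`, `bytes`), the 1000-byte size threshold, and all function names are assumptions made for illustration only.

```python
from collections import defaultdict
import math

def symbolize(flow):
    """Map a flow record to a discrete symbol (hypothetical two-bin size split)."""
    size = "small" if flow["bytes"] < 1000 else "large"
    return (flow["proto"], size)

def group_by_host(flows):
    """Aggregate flows into one symbol sequence per source host."""
    seqs = defaultdict(list)
    for f in flows:
        seqs[f["src"]].append(symbolize(f))
    return dict(seqs)

def fit_transitions(sequences):
    """Count symbol-to-symbol transitions observed in benign sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return counts

def anomaly_score(seq, counts, n_symbols, alpha=1.0):
    """Mean negative log-likelihood of the transitions in seq.

    Uses Laplace smoothing (alpha) so unseen transitions get a small,
    non-zero probability. Higher scores mean less conforming behavior.
    """
    if len(seq) < 2:
        return 0.0
    total = 0.0
    for a, b in zip(seq, seq[1:]):
        row = counts.get(a, {})
        p = (row.get(b, 0) + alpha) / (sum(row.values()) + alpha * n_symbols)
        total -= math.log(p)
    return total / (len(seq) - 1)
```

In such a setup, a detection threshold on the score would be chosen from held-out benign traffic; a host whose sequence scores above it would be reported as anomalous.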