CAML-IDS

A framework for the correct assessment of machine learning-based intrusion detection systems

Master Thesis (2019)
Author(s)

M. Vermeer (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

T. Fiebig – Mentor (TU Delft - Information and Communication Technology)

Michel Eeten – Graduation committee member (TU Delft - Organisation & Governance)

Reginald Lagendijk – Coach (TU Delft - Cyber Security)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2019 Mathew Vermeer
Publication Year
2019
Language
English
Graduation Date
08-07-2019
Awarding Institution
Delft University of Technology
Programme
Computer Science | Data Science and Technology
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The Internet is a relatively new technology that the world has become immensely dependent on. It is a tool that makes it possible to simplify our lives and better our society. But as with many things, there are people who wish to exploit this tool for their own malicious gain. One of the mechanisms that we can use for protection against these malicious actors is the intrusion detection system. Machine learning-based intrusion detection systems (IDSs) have been heavily researched for a number of years now. Much of this research, though, appears to be conducted using improper methodologies and incorrect evaluation.
Such methodologies include training and testing IDSs with unrealistic data and using uninformative metrics to determine performance. In this research, we perform a case study using one such IDS. This IDS is trained and evaluated using real network traffic collected from a real-world network. Additionally, we test its performance on actual attack traffic. This research demonstrates that an IDS originally trained with unrealistic data performs nowhere near as well as its author claims once it is trained using real network traffic. Finally, we propose CAML-IDS, a framework for the correct assessment of machine learning-based intrusion detection systems. This framework can assist future IDS research by preventing incorrect evaluation, in turn preventing the formulation of incorrect research.
