An Intrusion Detection System using Graph Neural Networks
T. Hristov (TU Delft - Electrical Engineering, Mathematics and Computer Science)
G. Smaragdakis – Mentor (TU Delft - Cyber Security)
Harm Griffioen – Graduation committee member (TU Delft - Cyber Security)
M. Khosla – Graduation committee member (TU Delft - Multimedia Computing)
Emmanouil Leontaris – Graduation committee member (Fox-IT)
Abstract
Cybersecurity attacks are increasingly sophisticated, while traditional, rule-based intrusion detection systems (IDS) remain prone to high false alert rates. This research explores temporal graph learning for network intrusion detection, introducing a framework that combines temporal graph construction with Graph Attention Networks and recurrent modeling (GATv2 + LSTM). We evaluate on the LANL authentication logs and on Zeek logs from the University of West Florida (UWF).
On LANL, our models (Try1/Try2) outperform state-of-the-art baselines for temporal link prediction, achieving high precision and robustness: Accuracy ≈ 0.994, F1 ≈ 0.993, AUC ≈ 0.993–0.998, AP ≈ 0.999. On Zeek data, edge prediction is sensitive to how malicious activity is distributed over time: a simple "Day" shuffling, which preserves the temporal structure while spreading out the clusters of attack activity, yields large gains (e.g., Accuracy ≈ 0.969, AUC ≈ 0.996, F1 ≈ 0.959, AP ≈ 0.995), whereas random shuffling disrupts temporal dependencies and degrades performance.
Extending to edge classification (benign vs. malicious) reveals a key limitation: despite high accuracy, AUC and AP remain low due to a tendency to label nearly all edges as benign under class imbalance and temporal clustering, producing many false negatives. We test mitigation strategies (dropout, alternative loss formulations with confidence weighting), which provide a small increase in stability but do not fully resolve the issue.
Our results indicate that the proposed temporal graph method is a strong fit for anomaly detection via edge prediction: robust across datasets, resilient to imbalance, and practically applicable. In contrast, edge classification is not yet reliable enough for production use without improved data balancing, graph construction, and training.
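The edge-classification limitation described above (high accuracy alongside many false negatives) can be illustrated with a small worked example. The sketch below is not the thesis code and uses hypothetical counts (990 benign and 10 malicious edges, not taken from LANL or UWF); the helper names are likewise assumptions made for illustration. It shows how a classifier that labels every edge as benign still scores high accuracy while detecting no attacks.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn), treating label 1 as malicious and 0 as benign."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy_and_recall(y_true, y_pred):
    """Compute accuracy and malicious-class recall from hard labels."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Recall is defined as 0.0 here when there are no malicious edges at all.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


# Hypothetical imbalanced edge set: 990 benign (0) and 10 malicious (1).
y_true = [0] * 990 + [1] * 10
# A degenerate classifier that predicts "benign" for every edge.
y_pred = [0] * 1000

accuracy, recall = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={accuracy:.3f}, malicious recall={recall:.3f}")
```

In this toy setting every malicious edge becomes a false negative, which is why threshold-independent metrics such as AUC and AP, rather than accuracy alone, are the more informative measures under class imbalance.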