Beyond Real Traffic
Assessing the Reliability of AI-Generated Network Data in Deep Learning-Based Intrusion Detection Models
H.E.J. Bosma (TU Delft - Electrical Engineering, Mathematics and Computer Science)
K. Liang – Mentor (TU Delft - Cyber Security)
G. Smaragdakis – Graduation committee member (TU Delft - Cyber Security)
J.H.G. Dauwels – Graduation committee member (TU Delft - Signal Processing Systems)
R. Wang – Mentor (TU Delft - Cyber Security)
Abstract
This thesis investigates how reliably Large Language Model (LLM)-generated data can be used to train deep learning-based Intrusion Detection Systems (IDS), moving beyond traditional datasets of real traffic. In the context of a small distributed environmental measurement application, application-layer sensor data (temperature, humidity, and particulate matter) and the corresponding HTTP Network Traffic Telemetry (NTT) were collected over one week using Raspberry Pi measurement stations and Zeek. Two Long Short-Term Memory (LSTM) models were trained: an Application Model (AM) for sensor anomalies and a Network Traffic Model (NTM) for network anomalies, which were combined in a voting-based IDS that outputs a trust score per data source. Using a structured prompting strategy, a publicly available LLM was then employed to generate synthetic counterparts of the sensor and NTT datasets. The similarity between the real and synthetic data distributions was quantified using the Wasserstein distance, after which two experiment series were conducted: (1) progressively replacing real samples with synthetic ones while keeping the training-set size fixed, and (2) augmenting the real data with increasing fractions of synthetic samples. Results show that replacing more than roughly 10% of the AM training data degrades detection performance, whereas the NTM remains robust until the real data is nearly fully replaced. In contrast, augmenting rather than replacing the real data preserves, and in some cases modestly improves, IDS performance. Overall, the findings indicate that LLM-generated data can effectively complement, but not fully replace, real measurements when carefully integrated into IDS training pipelines.
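The sketch below is not the thesis code, but a minimal illustration of the two quantitative steps the abstract describes: comparing real and LLM-generated feature distributions with the (1-D, per-feature) Wasserstein distance, and constructing the mixed training sets used in the two experiment series. The column names and the DataFrame layout are illustrative assumptions.

```python
import pandas as pd
from scipy.stats import wasserstein_distance


def per_feature_wasserstein(real: pd.DataFrame, synthetic: pd.DataFrame) -> dict:
    """1-D Wasserstein distance for each column shared by the real and synthetic sets."""
    return {
        col: wasserstein_distance(real[col].to_numpy(), synthetic[col].to_numpy())
        for col in real.columns.intersection(synthetic.columns)
    }


def replace_with_synthetic(real: pd.DataFrame, synthetic: pd.DataFrame,
                           fraction: float, seed: int = 0) -> pd.DataFrame:
    """Experiment series 1: keep the training-set size fixed, but swap
    `fraction` of the real rows for randomly drawn synthetic rows."""
    n_replace = int(len(real) * fraction)
    kept_real = real.sample(len(real) - n_replace, random_state=seed)
    drawn_synth = synthetic.sample(n_replace, replace=True, random_state=seed)
    return pd.concat([kept_real, drawn_synth], ignore_index=True)


def augment_with_synthetic(real: pd.DataFrame, synthetic: pd.DataFrame,
                           fraction: float, seed: int = 0) -> pd.DataFrame:
    """Experiment series 2: keep all real rows and append an extra
    `fraction` * len(real) synthetic rows."""
    n_add = int(len(real) * fraction)
    extra = synthetic.sample(n_add, replace=True, random_state=seed)
    return pd.concat([real, extra], ignore_index=True)


# Example usage with hypothetical sensor columns (temperature, humidity, pm):
# distances = per_feature_wasserstein(real_df, synthetic_df)
# train_10pct_replaced = replace_with_synthetic(real_df, synthetic_df, fraction=0.10)
# train_10pct_augmented = augment_with_synthetic(real_df, synthetic_df, fraction=0.10)
```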