Outlier and Anomaly-Handling for 6G Wireless Measurement Data

A Systematic, Downstream-Centric Comparison of Statistical Filters and Unsupervised Outlier Detectors for Tabular and Time-Series 6G Network Measurements

Bachelor Thesis (2026)

Author(s)

M. Stanescu (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

R. Hai – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Y. Wang – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

J. Urbano Merino – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty

Electrical Engineering, Mathematics and Computer Science

KPI 6G Outliers

To reference this document use

https://resolver.tudelft.nl/uuid:3f5bca1f-13fa-45dc-8fb6-023f551e7015

Publication Year

2026

Language

English

Graduation Date

26-06-2026

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Machine-learning-driven management of next-generation (6G) networks depends on measurements that are routinely corrupted by sensor noise, hardware imperfections, bursty interference, and malicious activity, so outlier handling is widely assumed to be a prerequisite for reliable downstream models. Whether cleaning methods actually help, which ones do, and whether the answer differs across data modalities remains unclear. Using two real, labelled datasets with no synthetic contamination (attack traffic from a functional 5G testbed, used for attack classification, and an operational web-latency KPI series, used for short-term forecasting), we systematically compare six outlier-handling methods against a no-cleaning baseline: three interpretable statistical filters (robust Z-score replacement, IQR clipping, and Savitzky–Golay smoothing) and three unsupervised detectors (Isolation Forest, Local Outlier Factor, and PCA reconstruction). Hyperparameters are tuned without access to held-out labels. Each method is evaluated under both a robust (Random Forest) and a noise-sensitive (k-NN) downstream model, using paired significance tests with false-discovery-rate (FDR) correction, a detection diagnostic, and runtime measurements. The result is largely negative: after FDR correction, no method significantly improves downstream performance on either modality. Savitzky–Golay smoothing gives the only suggestive forecasting gain (≈17% lower error under Random Forest), but it does not survive correction. Deletion- and clipping-based methods are neutral to harmful (IQR clipping significantly degrades classification). The unsupervised detectors rank real attacks barely above chance (ROC-AUC 0.54–0.60), even though a supervised model separates the same classes at 0.86. Statistical outlier detection is therefore a poor proxy for the anomalies of interest.
Because the slowest detectors are also the most harmful and exceed the near-real-time control budget, we conclude that a generic outlier-handling stage offers no reliable benefit for these tasks: its value must be demonstrated rather than assumed, with lightweight smoothing the only candidate worth trying on noisy sequential signals.
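
As a rough illustration of the two simplest statistical filters named in the abstract, the following Python sketch applies robust Z-score replacement and IQR clipping to a short list of values. It is not the thesis implementation: the function names, the 3.5 and 1.5 thresholds, the choice to replace flagged points with the median, the quartile method, and the sample latency values are all assumptions made for illustration.

```python
import statistics


def robust_z_replace(values, threshold=3.5):
    """Replace points with a large modified Z-score by the median.

    Assumption: the modified Z-score 0.6745 * (x - median) / MAD is used,
    and flagged points are replaced with the median of the input.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median: nothing can be flagged safely.
        return list(values)
    return [
        med if abs(0.6745 * (v - med) / mad) > threshold else v
        for v in values
    ]


def iqr_clip(values, k=1.5):
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR] instead of deleting them."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


# Hypothetical latency samples (ms) with one obvious spike at the end.
latencies = [10, 11, 9, 10, 12, 10, 11, 95]
replaced = robust_z_replace(latencies)
clipped = iqr_clip(latencies)
print(replaced)
print(clipped)
```

Both functions keep the series length unchanged, which matters for the forecasting setting: the spike is either replaced by a typical value or pulled back to the upper fence, while the remaining samples pass through untouched.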

Files

PAPERFINAL.pdf

(pdf | 0.637 Mb)

License info not available