Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pretraining and Customized Fine-Tuning

Journal Article (2025)
Author(s)

Y. Dong (TU Delft - Traffic Systems Engineering, RWTH Aachen University)

Xingmin Lu (North China University of Technology)

Ruohan Li (Villanova University)

Wei Song (North China University of Technology)

B. van Arem (TU Delft - Transport, Mobility and Logistics)

Haneen Farah (TU Delft - Traffic Systems Engineering)

Research Group
Traffic Systems Engineering
DOI (related publication)
https://doi.org/10.1177/03611981251333341
Publication Year
2025
Language
English
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The burgeoning navigation services using digital maps provide great convenience to drivers. Nevertheless, anomalies in lane-rendering map images occasionally introduce potential hazards, as they can mislead human drivers and consequently contribute to unsafe driving. To detect such anomalies accurately and efficiently, this paper frames lane-rendering image anomaly detection as a classification problem and proposes a four-phase pipeline: data preprocessing, self-supervised pretraining with the masked image modeling (MiM) method, customized fine-tuning using a cross-entropy-based loss with label smoothing, and post-processing. Leveraging state-of-the-art deep learning techniques, especially transformer models, the pipeline demonstrates superior performance across various experiments. Notably, self-supervised pretraining with MiM greatly enhances detection accuracy while significantly reducing the total training time. For instance, the Swin Transformer pretrained with Uniform Masking achieved an accuracy of 94.77% and an area under the curve (AUC) score of 0.9743, compared with 94.01% accuracy and an AUC of 0.9498 for the Swin Transformer without pretraining, while the number of fine-tuning epochs dropped from 280 to 41. An ablation study of techniques for alleviating the data imbalance between normal and abnormal instances further confirms their contribution to the model's overall performance. In conclusion, the proposed pipeline, incorporating self-supervised pretraining with MiM and other advanced deep learning techniques, emerges as a robust solution for enhancing the accuracy and efficiency of lane-rendering image anomaly detection in digital navigation systems.
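To make the customized fine-tuning phase concrete, the sketch below fine-tunes a Swin Transformer classifier with a label-smoothed cross-entropy loss in PyTorch. It is a minimal illustration, not the authors' implementation: the torchvision swin_t backbone, the smoothing factor of 0.1, the AdamW optimizer, the learning rate, and the dummy batch are all assumptions, and the MiM pretraining with Uniform Masking, data preprocessing, and post-processing phases are not reproduced here.

```python
import torch
import torch.nn as nn
from torchvision.models import swin_t

# Stand-in backbone; in the paper's pipeline the encoder weights would be
# initialized from MiM (masked image modeling) self-supervised pretraining.
model = swin_t(weights=None)
model.head = nn.Linear(model.head.in_features, 2)  # normal vs. abnormal lane rendering

# Cross-entropy-based loss with label smoothing, as in the fine-tuning phase
# (smoothing factor 0.1 is an assumption for illustration).
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def fine_tune_step(images, labels):
    """Run one fine-tuning step on a batch of lane-rendering images."""
    model.train()
    optimizer.zero_grad()
    logits = model(images)            # (batch, 2) class scores
    loss = criterion(logits, labels)  # label-smoothed cross-entropy
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    # Dummy batch just to exercise the step; real inputs would come from the
    # preprocessed lane-rendering image dataset.
    images = torch.randn(4, 3, 224, 224)
    labels = torch.randint(0, 2, (4,))
    print(fine_tune_step(images, labels))
```

Label smoothing softens the one-hot targets during fine-tuning, which is one common way to reduce overconfidence when the normal and abnormal classes are imbalanced; the paper's ablation study examines such imbalance-mitigation techniques.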