Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning

Poster (2024)
Author(s)

Y. Dong (TU Delft - Transport and Planning)

Xingmin Lu (North China University of Technology)

Ruohan Li (Villanova University)

Wei Song (North China University of Technology)

B. van Arem (TU Delft - Transport and Planning)

Haneen Farah (TU Delft - Transport and Planning)

Research Group
Transport and Planning
Copyright
© 2024 Y. Dong, Xingmin Lu, Ruohan Li, Wei Song, B. van Arem, H. Farah
Publication Year
2024
Language
English
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The burgeoning navigation services using digital maps provide great convenience to drivers. However, lane rendering map images sometimes contain anomalies that can mislead human drivers and result in unsafe driving. To detect these anomalies accurately and efficiently, this paper casts lane rendering image anomaly detection as a classification problem and proposes a four-phase pipeline: data pre-processing, self-supervised pre-training with masked image modeling (MIM), customized fine-tuning using cross-entropy loss with label smoothing, and post-processing, built on state-of-the-art deep learning techniques, especially Transformer models. Various experiments verify the effectiveness of the proposed pipeline. It delivers superior lane rendering image anomaly detection performance, and the self-supervised pre-training with MIM in particular greatly improves detection accuracy while significantly reducing total training time. For example, a Swin Transformer with Uniform Masking as self-supervised pre-training (Swin-Trans-UM) achieved an accuracy of 94.77% and an Area Under the Curve (AUC) of 0.9743, compared with 94.01% accuracy and 0.9498 AUC for the pure Swin Transformer without pre-training (Swin-Trans), while the number of fine-tuning epochs dropped from 280 to 41. An ablation study on techniques for alleviating the data imbalance between normal and abnormal instances further enhances model performance.
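The fine-tuning loss mentioned above, cross-entropy with label smoothing, can be sketched as follows. This is a minimal NumPy illustration of the general technique, not the authors' implementation; the smoothing factor `eps=0.1` is an assumed common default, as the abstract does not state the value used.

```python
import numpy as np

def smoothed_cross_entropy(logits, target, eps=0.1):
    """Cross-entropy with label smoothing for a single example.

    logits: 1-D array of raw class scores; target: true class index.
    The smoothed target distribution places (1 - eps) on the true class
    and eps / (K - 1) on each of the other K - 1 classes, discouraging
    the model from becoming overconfident. eps=0.1 is an assumed
    default, not a value reported in the abstract.
    """
    k = logits.size
    # Numerically stable log-softmax: shift by the max before exponentiating.
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    # Build the smoothed target distribution.
    smooth = np.full(k, eps / (k - 1))
    smooth[target] = 1.0 - eps
    # Cross-entropy between the smoothed targets and the predictions.
    return float(-(smooth * log_probs).sum())
```

With `eps=0` this reduces to the ordinary cross-entropy loss, so the smoothing strength can be tuned without changing the training loop.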