Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning

Abstract

The burgeoning navigation services using digital maps provide great convenience to drivers. However, the lane rendering map images sometimes contain anomalies, which may mislead human drivers and result in unsafe driving. To detect these anomalies accurately and effectively, this paper transforms lane rendering image anomaly detection into a classification problem and tackles it with a four-phase pipeline built on state-of-the-art deep learning techniques, especially Transformer models: data pre-processing, self-supervised pre-training with the masked image modeling (MIM) method, customized fine-tuning using cross-entropy loss with label smoothing, and post-processing. Various experiments verify the effectiveness of the proposed pipeline. The proposed pipeline delivers superior lane rendering image anomaly detection performance; in particular, the self-supervised pre-training with MIM greatly improves detection accuracy while significantly reducing the total training time. For example, the Swin Transformer with Uniform Masking as self-supervised pre-training (Swin-Trans-UM) achieved an accuracy of 94.77% and an Area Under the Curve (AUC) of 0.9743, compared with an accuracy of 94.01% and an AUC of 0.9498 for the pure Swin Transformer without pre-training (Swin-Trans), while the number of fine-tuning epochs dropped from the original 280 to 41. An ablation study of techniques to alleviate the data imbalance between normal and abnormal instances further enhances the model performance.
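The fine-tuning phase described above classifies rendered lane images as normal or abnormal using cross-entropy loss with label smoothing. Below is a minimal, hypothetical PyTorch sketch of that step; the optimizer, learning rate, smoothing factor, and epoch count are illustrative assumptions rather than the authors' actual configuration, and the backbone and data loader are assumed to be provided elsewhere.

```python
# Minimal sketch (assumptions: binary normal/abnormal labels, PyTorch >= 1.10
# for the label_smoothing argument, and a MIM pre-trained backbone passed in).
import torch
import torch.nn as nn

def fine_tune(model, train_loader, device="cuda", epochs=41, smoothing=0.1):
    """Fine-tune a pre-trained backbone for lane rendering anomaly classification."""
    model = model.to(device)
    # Cross-entropy with label smoothing, as named in the customized fine-tuning phase.
    criterion = nn.CrossEntropyLoss(label_smoothing=smoothing)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.05)

    model.train()
    for epoch in range(epochs):
        running_loss = 0.0
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            logits = model(images)            # shape: (batch, 2) for normal/abnormal
            loss = criterion(logits, labels)  # smoothed cross-entropy
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        print(f"epoch {epoch + 1}: mean loss = {running_loss / len(train_loader):.4f}")
```

Label smoothing softens the one-hot targets toward a uniform distribution, which can reduce overconfidence and is a common choice when one class (here, abnormal renderings) is much rarer or noisier than the other.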