Navigation services built on digital maps offer drivers great convenience. However, anomalies in lane-rendering map images can mislead drivers and thereby contribute to unsafe driving. To detect such anomalies accurately and efficiently, this paper recasts lane-rendering image anomaly detection as a classification problem and proposes a four-phase pipeline: data preprocessing, self-supervised pretraining with masked image modeling (MiM), customized fine-tuning using a cross-entropy-based loss with label smoothing, and post-processing. Built on state-of-the-art deep learning techniques, especially transformer models, the pipeline demonstrates superior performance across various experiments. Notably, self-supervised pretraining with MiM improves detection accuracy while significantly reducing total training time. For instance, the Swin Transformer pretrained with Uniform Masking achieved an accuracy of 94.77% and an area under the curve (AUC) score of 0.9743, compared with an accuracy of 94.01% and an AUC of 0.9498 for the plain Swin Transformer without pretraining. Furthermore, the number of fine-tuning epochs dropped from 280 to 41. An ablation study of techniques for alleviating the data imbalance between normal and abnormal instances shows that they further improve the model's overall performance. In conclusion, the proposed pipeline, which combines self-supervised pretraining using MiM with other advanced deep learning techniques, is a robust solution for improving the accuracy and efficiency of lane-rendering image anomaly detection in digital navigation systems.
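
To make the fine-tuning objective concrete, the following is a minimal sketch of standard cross-entropy with label smoothing for a single sample. It is not the paper's customized loss, whose details the abstract does not give. The function names, the two-class setup (normal vs. abnormal), and the smoothing factor `epsilon=0.1` are assumptions chosen purely for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def smoothed_cross_entropy(logits, target, epsilon=0.1):
    """Cross-entropy against a label-smoothed target distribution.

    The one-hot target is replaced by
        q_i = (1 - epsilon) * [i == target] + epsilon / K,
    where K is the number of classes. epsilon=0.1 is an illustrative
    assumption, not a value taken from the paper.
    """
    k = len(logits)
    probs = softmax(logits)
    loss = 0.0
    for i, p in enumerate(probs):
        q = (1.0 - epsilon) * (1.0 if i == target else 0.0) + epsilon / k
        loss -= q * math.log(p)
    return loss


# Hypothetical two-class example: index 0 = normal, index 1 = abnormal.
confident_logits = [5.0, -5.0]
plain = smoothed_cross_entropy(confident_logits, target=0, epsilon=0.0)
smoothed = smoothed_cross_entropy(confident_logits, target=0, epsilon=0.1)
print(f"plain CE: {plain:.6f}, smoothed CE: {smoothed:.6f}")
```

With `epsilon=0` the function reduces to ordinary cross-entropy. With `epsilon>0` a very confident correct prediction is penalized slightly, which discourages overconfident outputs.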