Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning

📅 2023-12-07
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses anomaly detection in digital map lane-rendering images. We propose a four-stage Transformer-based pipeline: data preprocessing → masked image modeling (MiM) self-supervised pretraining → label-smoothed cross-entropy fine-tuning → post-processing. To our knowledge, this is the first application of MiM to this task; it integrates Swin Transformer with uniform masking and introduces a task-specific end-to-end classification architecture and fine-tuning strategy tailored to the structural characteristics of map imagery. On the benchmark dataset, our method achieves 94.77% accuracy (+0.76 percentage points over the Swin Transformer without pre-training) and an AUC of 0.9743 (+0.0245), while reducing fine-tuning epochs from 280 to 41, a roughly 6.8× reduction. The core contributions are: (i) pioneering the adaptation of MiM for lane-rendering anomaly detection; and (ii) demonstrating substantial gains in both detection accuracy and training efficiency over prior approaches.
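The gains quoted above can be re-derived from the figures reported in the abstract. The snippet below is only an arithmetic check; the dictionary names are illustrative, and the epoch ratio counts fine-tuning epochs only, not the cost of pre-training.

```python
# Figures as reported in the paper's abstract.
baseline = {"accuracy": 94.01, "auc": 0.9498, "epochs": 280}    # Swin-Trans
pretrained = {"accuracy": 94.77, "auc": 0.9743, "epochs": 41}   # Swin-Trans-UM

# Accuracy is a percentage, so the difference is in percentage points.
acc_gain = round(pretrained["accuracy"] - baseline["accuracy"], 2)
auc_gain = round(pretrained["auc"] - baseline["auc"], 4)
# Ratio of fine-tuning epochs (baseline / pre-trained model).
epoch_ratio = baseline["epochs"] / pretrained["epochs"]

print(f"accuracy gain: {acc_gain} percentage points")
print(f"AUC gain: {auc_gain}")
print(f"fine-tuning epoch ratio: {epoch_ratio:.2f}x")
```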
πŸ“ Abstract
Navigation services built on digital maps provide great convenience to drivers. However, anomalies in lane rendering map images can mislead human drivers and contribute to unsafe driving conditions. To detect such anomalies accurately and effectively, this paper transforms lane rendering image anomaly detection into a classification problem and proposes a four-phase pipeline that leverages state-of-the-art deep learning techniques, especially Transformer models. The four phases are data pre-processing, self-supervised pre-training with the masked image modeling (MiM) method, customized fine-tuning using a cross-entropy-based loss with label smoothing, and post-processing. Various experiments verify the effectiveness of the proposed pipeline. The results indicate that the pipeline exhibits superior performance in lane rendering image anomaly detection and, notably, that self-supervised pre-training with MiM can greatly enhance detection accuracy while significantly reducing the total training time. For instance, the Swin Transformer with Uniform Masking as self-supervised pre-training (Swin-Trans-UM) reached an accuracy of 94.77% and an Area Under the Curve (AUC) score of 0.9743, compared with an accuracy of 94.01% and an AUC of 0.9498 for the pure Swin Transformer without pre-training (Swin-Trans). The number of fine-tuning epochs dropped from 280 to 41. In conclusion, the proposed pipeline, which incorporates self-supervised pre-training using MiM and other advanced deep learning techniques, is a robust solution for enhancing the accuracy and efficiency of lane rendering image anomaly detection in digital navigation systems.
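To make the fine-tuning loss concrete, here is a minimal sketch of cross-entropy with label smoothing in plain Python. It is not the authors' code: the function name, the smoothing value of 0.1, and the two-class example (normal vs. abnormal rendering) are assumptions chosen for illustration. The smoothed target puts `1 - smoothing` on the true class and spreads `smoothing` uniformly over all classes.

```python
import math

def label_smoothed_cross_entropy(logits, target, smoothing=0.1):
    """Cross-entropy against a smoothed target distribution.

    logits: list of raw class scores; target: index of the true class.
    smoothing=0.1 is an illustrative value, not taken from the paper.
    """
    num_classes = len(logits)
    # Numerically stable log-softmax.
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    log_probs = [z - log_norm for z in logits]
    # Smoothed target: smoothing / K on every class,
    # plus (1 - smoothing) on the true class.
    loss = 0.0
    for k, log_p in enumerate(log_probs):
        q = smoothing / num_classes
        if k == target:
            q += 1.0 - smoothing
        loss -= q * log_p
    return loss

# Hypothetical two-class example: class 0 = normal, class 1 = abnormal.
logits = [2.0, -1.0]
plain = label_smoothed_cross_entropy(logits, target=0, smoothing=0.0)
smoothed = label_smoothed_cross_entropy(logits, target=0, smoothing=0.1)
print(f"plain CE: {plain:.4f}, smoothed CE: {smoothed:.4f}")
```

With `smoothing=0.0` the function reduces to ordinary cross-entropy. For a confident, correct prediction the smoothed loss is larger, because the target keeps a small probability on the other class and so discourages over-confident outputs.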
Problem

Research questions and friction points this paper is trying to address.

Detects anomalies in lane rendering images
Uses Transformer with self-supervised pre-training
Enhances accuracy and reduces training time
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer with self-supervised pre-training
Customized fine-tuning using cross-entropy
Masked image modeling for anomaly detection
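As a rough illustration of the masking step in MiM pre-training, the sketch below builds a patch mask that keeps exactly one randomly chosen patch in every 2×2 block of the patch grid. This summary names Uniform Masking but does not describe how it works, so the one-per-block rule, the function name, and the grid size are assumptions made for illustration; any further masking stages and the reconstruction model itself are omitted.

```python
import random

def uniform_sample_mask(grid_h, grid_w, seed=None):
    """Return a grid_h x grid_w mask: True = patch hidden, False = patch kept.

    Exactly one patch is kept in every 2x2 block, so 75% of the patches
    are hidden and the kept patches are spread evenly over the image.
    """
    if grid_h % 2 or grid_w % 2:
        raise ValueError("grid dimensions must be even")
    rng = random.Random(seed)
    mask = [[True] * grid_w for _ in range(grid_h)]
    for top in range(0, grid_h, 2):
        for left in range(0, grid_w, 2):
            # Pick which of the four patches in this block stays visible.
            dy, dx = rng.randrange(2), rng.randrange(2)
            mask[top + dy][left + dx] = False
    return mask

# Hypothetical 4x4 patch grid; '#' marks hidden patches, '.' kept ones.
mask = uniform_sample_mask(4, 4, seed=0)
for row in mask:
    print("".join("#" if hidden else "." for hidden in row))
```

During pre-training, a model would be asked to reconstruct the hidden patches from the visible ones. Sampling per block rather than over the whole image keeps the visible patches evenly distributed.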
Yongqi Dong
RWTH Aachen; TU Delft; UC Berkeley
AI, ITS, Automated Driving, Shared & Smart Mobility, Big Data & Interdisciplinary Study
Xingmin Lu
School of Electrical and Control Engineering, North China University of Technology, Beijing, China, 100144
Ruohan Li
Department of Civil and Environmental Engineering, College of Engineering, Villanova University, Villanova, USA, PA 19085
Wei Song
School of Information Science and Technology, North China University of Technology, Beijing, China, 100144
B. Arem
Faculty of Civil Engineering and Geosciences, Delft University of Technology, Delft, The Netherlands, 2628 CN
Haneen Farah
TU Delft
Traffic Safety, Road Infrastructure Design, Road User Behaviour, Intelligent Transportation Systems