🤖 AI Summary
This work addresses the significant performance degradation of RGB-camera-based kilometer marker recognition systems under drastic illumination changes, high-speed motion, and adverse weather. The authors present EvMetro5K, the first large-scale, synchronized RGB-event kilometer marker dataset, and propose a hypergraph-prompted multimodal fusion framework that integrates event camera data with a pre-trained OCR foundation model to enable adaptive multimodal perception. Experiments on EvMetro5K and multiple benchmark datasets show that the proposed method improves both the accuracy and the robustness of kilometer marker recognition under extreme conditions, including low-light and high-speed environments.
📝 Abstract
Metro trains often operate in highly complex environments characterized by illumination variations, high-speed motion, and adverse weather conditions. These factors pose significant challenges for visual perception systems, especially those relying solely on conventional RGB cameras. To tackle these difficulties, we explore the integration of event cameras into the perception system, leveraging their robustness in low-light and high-speed scenarios and their low power consumption. Specifically, we focus on Kilometer Marker Recognition (KMR), a critical task for autonomous metro localization under GNSS-denied conditions. In this context, we propose a robust baseline method based on a pre-trained RGB OCR foundation model, enhanced through multi-modal adaptation. Furthermore, we construct EvMetro5K, the first large-scale RGB-Event dataset for KMR, containing 5,599 pairs of synchronized RGB-Event samples, split into 4,479 training and 1,120 testing samples. Extensive experiments on EvMetro5K and other widely used benchmarks demonstrate the effectiveness of our approach for KMR. Both the dataset and source code will be released at https://github.com/Event-AHU/EvMetro5K_benchmark
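
The reported split (4,479 training and 1,120 testing pairs out of 5,599) is consistent with an approximately 80/20 partition. The following is a minimal sketch of such a partition, not the authors' procedure: the function name `split_pairs`, the seeded random shuffle, and the use of integer sample IDs are all assumptions made for illustration, since the abstract does not state how the split was produced.

```python
import random


def split_pairs(sample_ids, train_fraction=0.8, seed=0):
    """Partition paired RGB-Event sample IDs into train and test lists.

    Hypothetical helper: the shuffle, seed, and fraction are illustrative
    assumptions, not details taken from the EvMetro5K release.
    """
    ids = list(sample_ids)
    # A seeded shuffle keeps the partition reproducible across runs.
    random.Random(seed).shuffle(ids)
    # Round down so the training set never exceeds the requested fraction.
    n_train = int(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# 5,599 pairs, as reported in the abstract.
train_ids, test_ids = split_pairs(range(5599))
print(len(train_ids), len(test_ids))
```

Splitting by sample ID rather than by file keeps each RGB frame with its synchronized event data, so no pair is divided between the training and testing sets.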