🤖 AI Summary
This work addresses the absence of public benchmarks and standardized evaluation protocols for camera-based perception in railway environments, which has hindered reproducible and fair comparison of perception algorithms for automated train operation. To close this gap, we introduce RAIL-BENCH, the first perception benchmark suite for railway scene understanding, covering five tasks: rail track detection, object detection, vegetation segmentation, multi-object tracking, and monocular visual odometry. Our contributions include curated training and test datasets drawn from diverse real-world scenarios, task-specific evaluation metrics, and public scoreboards. Among the metrics is LineAP, a novel segment-based average precision metric for rail track detection that evaluates polyline predictions without requiring instance-level grouping. RAIL-BENCH is intended to make AI-driven perception research for railway applications more standardized and reproducible.
📝 Abstract
Automated train operation on existing railway infrastructure requires robust camera-based perception, yet the railway domain lacks public benchmark suites with standardized evaluation protocols that would enable reproducible comparison of approaches. We present RAIL-BENCH, the first perception benchmark suite for the railway domain. It comprises five challenges (rail track detection, object detection, vegetation segmentation, multi-object tracking, and monocular visual odometry), each tailored to the specific characteristics of railway environments. RAIL-BENCH provides curated training and test datasets drawn from diverse real-world scenarios, evaluation metrics, and public scoreboards (https://www.mrt.kit.edu/railbench). For the rail track detection challenge, we introduce LineAP, a novel segment-based average precision metric that evaluates the geometric accuracy of polyline predictions independently of instance-level grouping, addressing key limitations of existing line detection metrics.
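The abstract does not give the formula for LineAP, so the following is only a minimal sketch of what a generic segment-level average precision could look like, not the paper's definition. It assumes that polylines are split into two-point segments, that a predicted segment counts as a true positive if it lies within a distance threshold of a still-unmatched ground-truth segment, and that each segment inherits the confidence score of its polyline. The function names, the endpoint-based distance, and the `threshold` parameter are all hypothetical choices made for illustration.

```python
import math


def to_segments(polyline):
    """Split a polyline [(x, y), ...] into consecutive two-point segments."""
    return list(zip(polyline[:-1], polyline[1:]))


def segment_distance(a, b):
    """Assumed distance: mean endpoint distance, minimised over both orderings."""
    (a0, a1), (b0, b1) = a, b
    direct = (math.dist(a0, b0) + math.dist(a1, b1)) / 2
    flipped = (math.dist(a0, b1) + math.dist(a1, b0)) / 2
    return min(direct, flipped)


def segment_ap(predictions, ground_truth, threshold=5.0):
    """Non-interpolated AP over segments (illustrative, not the LineAP definition).

    predictions:  list of (polyline, score) pairs
    ground_truth: list of polylines
    """
    gt_segments = [s for line in ground_truth for s in to_segments(line)]
    if not gt_segments:
        return 0.0

    # Flatten all predictions into scored segments; how segments were grouped
    # into polylines plays no role from here on.
    pred_segments = [
        (s, score) for line, score in predictions for s in to_segments(line)
    ]
    pred_segments.sort(key=lambda item: item[1], reverse=True)

    matched = set()
    tp = fp = 0
    precision_sum = 0.0
    for seg, _score in pred_segments:
        # Greedily pick the closest unmatched ground-truth segment in range.
        best_idx, best_dist = None, threshold
        for i, gt in enumerate(gt_segments):
            if i in matched:
                continue
            d = segment_distance(seg, gt)
            if d <= best_dist:
                best_idx, best_dist = i, d
        if best_idx is None:
            fp += 1
        else:
            matched.add(best_idx)
            tp += 1
            precision_sum += tp / (tp + fp)  # precision at this recall step

    return precision_sum / len(gt_segments)


# Two ground-truth rails; the prediction splits the left rail into two pieces.
rails = [[(0, 0), (0, 10), (0, 20)], [(5, 0), (5, 10), (5, 20)]]
preds = [
    ([(0, 0), (0, 10)], 0.9),
    ([(0, 10), (0, 20)], 0.9),
    ([(5, 0), (5, 10), (5, 20)], 0.9),
]
score = segment_ap(preds, rails)
```

In this toy case the left rail is predicted as two separate polylines, yet every predicted segment still matches a ground-truth segment, so the score is not penalised for the different grouping. That is the property the abstract attributes to a metric that is independent of instance-level grouping.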