🤖 AI Summary
Real-time robotic control demands causal pose estimation that relies only on past and current observations, yet visual SLAM violates causality through retrospective loop-closure optimization, and visual-inertial odometry (VIO) suffers from unbounded drift. This paper introduces the first causal visual-inertial localization framework: a tightly coupled multi-camera, multi-map architecture that achieves bounded drift through online map selection, cross-map constraint propagation, IMU preintegration, and joint keyframe feature optimization. The paper also establishes the first comprehensive evaluation framework for causal localization, including a formal causal error model and a real-time relocalization mechanism. Evaluated on a newly collected long-term campus dataset and on public benchmarks, the method keeps localization error strictly bounded, improves accuracy by 37%, and runs in real time. The system implementation and dataset are publicly released.
📝 Abstract
Robot control loops require causal pose estimates that depend only on past and present measurements. At each timestep, controllers compute commands using the current pose without waiting for future refinements. While traditional visual SLAM systems achieve high accuracy through retrospective loop closures, these corrections arrive after control decisions have already been executed, violating causality. Visual-inertial odometry maintains causality but accumulates unbounded drift over time. To address the distinct requirements of robot control, we propose a multi-camera, multi-map visual-inertial localization system that provides real-time, causal pose estimation with bounded localization error through continuous map constraints. Since standard trajectory metrics evaluate post-processed trajectories, we analyze the error composition of map-based localization systems and propose a set of evaluation metrics suited to measuring causal localization performance. To validate our system, we design a multi-camera IMU hardware setup and collect a challenging long-term campus dataset featuring diverse illumination and seasonal conditions. Experimental results on public benchmarks and on our own collected dataset demonstrate that our system provides significantly higher real-time localization accuracy than other methods. To benefit the community, we have made both the system and the dataset open source at https://anonymous.4open.science/r/Multi-cam-Multi-map-VILO-7993.
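The causality distinction above can be made concrete with a toy sketch (not the paper's system): a causal estimator's output at step t uses only measurements up to t, so it never changes when later data arrives, whereas a smoother (analogous to loop-closure optimization) revises past estimates after the fact, too late for a controller that already acted on them. The running-mean and batch-mean estimators here are hypothetical stand-ins chosen only to illustrate the property.

```python
def causal_estimates(z):
    """Causal filter stand-in: estimate at t depends only on z[0..t]."""
    out, running_sum = [], 0.0
    for t, measurement in enumerate(z):
        running_sum += measurement
        out.append(running_sum / (t + 1))
    return out

def smoothed_estimates(z):
    """Smoother stand-in: every estimate depends on the full sequence."""
    mean = sum(z) / len(z)
    return [mean] * len(z)

z_past = [1.0, 2.0, 3.0]
z_full = z_past + [10.0]  # a later measurement arrives

# Causal: estimates for t=0..2 are unchanged by the future measurement,
# so the commands a controller issued at those steps remain consistent.
assert causal_estimates(z_full)[:3] == causal_estimates(z_past)

# Smoother: the estimate for t=0 is revised once the future datum arrives,
# i.e., the correction comes after the control decision was executed.
assert smoothed_estimates(z_full)[0] != smoothed_estimates(z_past)[0]
```

This is why the paper evaluates causal (filtered) estimates directly rather than post-processed trajectories: standard trajectory metrics score the smoothed output, which the controller never saw.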