🤖 AI Summary
Event cameras offer high dynamic range and microsecond temporal resolution, but noise in their event streams makes it hard to select high-quality 3D map points, which limits the accuracy of state estimation. To address this, we propose Voxel-ESVIO, an event-based stereo visual-inertial odometry system built on voxel map management. The key idea is to manage map points on a per-voxel basis: voxel-based point selection and voxel-aware point management jointly optimize which points are selected and how they are updated, so that the system retrieves noise-resilient points with the highest likelihood of being observed in the current frame. Evaluated on three public benchmarks, Voxel-ESVIO outperforms state-of-the-art methods in both accuracy and computational efficiency.
📝 Abstract
The event camera, renowned for its high dynamic range and exceptional temporal resolution, is recognized as an important sensor for visual odometry. However, the inherent noise in event streams complicates the selection of high-quality map points, which critically determine the precision of state estimation. To address this challenge, we propose Voxel-ESVIO, an event-based stereo visual-inertial odometry system that uses voxel map management to efficiently select high-quality 3D points. Specifically, our method combines voxel-based point selection and voxel-aware point management to jointly optimize the selection and updating of map points on a per-voxel basis. Together, these strategies enable the efficient retrieval of noise-resilient map points with the highest observation likelihood in the current frame, thereby ensuring accurate state estimation. Extensive evaluations on three public benchmarks demonstrate that Voxel-ESVIO outperforms state-of-the-art methods in both accuracy and computational efficiency.
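To make the per-voxel idea concrete, here is a minimal sketch of selecting map points on a per-voxel basis. It is not the authors' implementation: the function names, the `voxel_size` and `max_per_voxel` parameters, and the scalar `scores` (a stand-in for whatever quality or observation-likelihood measure the system actually uses) are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def voxel_key(point, voxel_size):
    """Map a 3D point to the integer index of the voxel that contains it."""
    return tuple(math.floor(c / voxel_size) for c in point)


def select_points_per_voxel(points, scores, voxel_size=0.5, max_per_voxel=1):
    """Keep the highest-scoring points in each voxel (hypothetical helper).

    points: list of (x, y, z) tuples.
    scores: one quality score per point (assumed; higher is better).
    Returns the sorted indices of the selected points.
    """
    # Group point indices by the voxel they fall into.
    voxels = defaultdict(list)
    for idx, (p, s) in enumerate(zip(points, scores)):
        voxels[voxel_key(p, voxel_size)].append((s, idx))

    selected = []
    for entries in voxels.values():
        # Sort by descending score; break ties by the lower index.
        entries.sort(key=lambda e: (-e[0], e[1]))
        selected.extend(idx for _, idx in entries[:max_per_voxel])
    return sorted(selected)


points = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (0.9, 0.1, 0.1), (-0.1, 0.0, 0.0)]
scores = [0.3, 0.8, 0.5, 0.1]
kept = select_points_per_voxel(points, scores, voxel_size=0.5)
```

In this toy input the first two points share a voxel, so only the higher-scoring one is kept, while the other two points each occupy their own voxel. Bucketing by voxel keeps the selection spatially spread out and bounds the number of points per region, which is one plausible reason a voxel map helps with both noise and runtime.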