AI Summary
To address spatial drift from vision and temporal drift from inertial measurements in visual-inertial odometry (VIO), this paper proposes a high-frame-rate, tightly coupled VIO framework implemented on a focal-plane sensor-processor (FPSP) array. The method fuses binary visual features with 400 Hz IMU data and achieves real-time operation at 250 FPS on FPSP hardware. A pixel-level tightly coupled filter, built upon an improved Multi-State Constraint Kalman Filter (MSCKF), enhances pose estimation accuracy and system responsiveness. Experimental evaluation on public benchmarks demonstrates superior performance over ROVIO, VINS-Mono, and ORB-SLAM3 in both localization accuracy and end-to-end latency. These results validate the framework's effectiveness for high-frequency heterogeneous sensor fusion and edge-deployable real-time VIO.
Abstract
Vision algorithms can be executed directly on the image sensor when implemented on next-generation sensors known as focal-plane sensor-processor arrays (FPSPs), where every pixel has a processor. FPSPs greatly improve latency, reducing the problems associated with the bottleneck of data transfer from a vision sensor to a processor, and thereby accelerate vision-based algorithms such as visual-inertial odometry (VIO). However, VIO frameworks suffer from spatial drift due to vision-based pose estimation, whilst temporal drift arises from the inertial measurements. FPSPs circumvent the spatial drift by operating at a high frame rate that matches the high-frequency output of the inertial measurements. In this paper, we present TCB-VIO, a tightly coupled 6-degrees-of-freedom VIO based on a Multi-State Constraint Kalman Filter (MSCKF), operating at a high frame rate of 250 FPS with IMU measurements obtained at 400 Hz. TCB-VIO outperforms state-of-the-art methods: ROVIO, VINS-Mono, and ORB-SLAM3.
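The core scheduling idea behind a tightly coupled filter at these rates can be illustrated with a minimal sketch: IMU samples at 400 Hz drive the prediction step, and each 250 FPS camera frame triggers a Kalman update. This is not the paper's implementation; the 1-D position/velocity state, the function names, and the noise parameters are illustrative assumptions (a real MSCKF maintains a sliding window of camera poses and marginalizes feature constraints).

```python
import numpy as np

# Toy two-rate predict/update loop (illustrative only, not TCB-VIO itself):
# IMU samples at 400 Hz propagate the state; camera frames at 250 FPS
# trigger an EKF-style update. State x = [position, velocity].
IMU_HZ, CAM_HZ = 400.0, 250.0

def propagate(x, P, accel, dt, q=1e-3):
    """Predict step: integrate one IMU (acceleration) sample."""
    F = np.array([[1.0, dt], [0.0, 1.0]])          # constant-velocity model
    x = F @ x + np.array([0.5 * dt**2, dt]) * accel
    P = F @ P @ F.T + q * np.eye(2)                # inflate covariance
    return x, P

def update(x, P, z, r=1e-2):
    """Update step: fuse one visual position measurement z."""
    H = np.array([[1.0, 0.0]])                     # observe position only
    S = H @ P @ H.T + r                            # innovation covariance
    K = P @ H.T / S                                # Kalman gain (2x1)
    x = x + (K * (z - H @ x)).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P

def run(duration=1.0):
    """Interleave IMU propagation and camera updates over `duration` seconds."""
    x, P = np.zeros(2), np.eye(2)
    n_imu = int(duration * IMU_HZ)
    n_cam = int(duration * CAM_HZ)
    cam_idx = 0
    for k in range(n_imu):
        t = (k + 1) / IMU_HZ
        x, P = propagate(x, P, accel=0.0, dt=1.0 / IMU_HZ)
        # Apply every camera update whose timestamp has been reached.
        while cam_idx < n_cam and (cam_idx + 1) / CAM_HZ <= t:
            x, P = update(x, P, z=0.0)
            cam_idx += 1
    return n_imu, cam_idx
```

Over one second this schedule performs 400 propagation steps and 250 updates, mirroring the 400 Hz / 250 FPS rates reported in the abstract; the high visual rate is what keeps the vision-induced spatial drift in check between inertial corrections.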