🤖 AI Summary
This work addresses a limitation of conventional event-based SLAM systems: they process data at a fixed rate and therefore struggle to achieve low latency and high accuracy at the same time. The authors propose AERO-VIS, an asynchronous visual-inertial SLAM framework that fuses stereo event streams with inertial measurements, uses a data-driven keypoint detector, and dynamically adapts its computational load so that state estimation runs in real time fully onboard. According to the authors, this is the first event-inertial SLAM system, with event cameras as its only visual sensors, to demonstrate closed-loop UAV control and large-scale state estimation using only onboard compute. Experiments on a UAV show that the system reaches unprecedented accuracy for onboard event-based SLAM while keeping latency low.
📝 Abstract
The robustness of event cameras to high dynamic range and motion blur holds the potential to improve visual odometry systems in challenging environments. Although their high temporal resolution does not require synchronous processing, most event-based odometry methods still run at fixed rates, which simplifies system design but restricts latency and throughput. In this work, we present AERO-VIS, a stereo event-inertial SLAM system with an integrated, data-driven, robust, and performance-optimized keypoint detector. By processing the event stream asynchronously, the system dynamically adapts to downstream runtime demands, ensuring low-latency and real-time performance. When deploying AERO-VIS on a UAV, we achieve unprecedented accuracy in onboard event-based SLAM. These unique characteristics enable us to present the first purely event-based inertial SLAM system that demonstrates closed-loop UAV control and large-scale state estimation while relying solely on onboard compute. A video of the experiments and the source code are available at ethz-mrl.github.io/AERO-VIS.