🤖 AI Summary
This work addresses the challenge of robust visual perception for legged robots under high-speed motion and complex lighting conditions, where conventional frame-based cameras suffer from motion blur and degraded performance. To this end, the authors present the first quadrupedal robot dataset integrating multispectral stereo event cameras, a frame-based camera, an IMU, and joint encoders. The dataset comprises over 30 real-world sequences spanning diverse locomotion speeds, spectral wavelengths, and illumination conditions, accompanied by high-precision intrinsic and extrinsic calibration parameters as well as accurate time synchronization. As the first benchmark offering synchronized multispectral stereo event-based vision and inertial data for quadrupedal robots, it fills a critical gap in multimodal perception under high-speed and visually challenging scenarios, enabling the development and evaluation of algorithms for agile perception, sensor fusion, and semantic segmentation.
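Since event cameras output an asynchronous stream of per-pixel brightness changes rather than full frames, many downstream perception pipelines first aggregate events over a short time window. The minimal sketch below illustrates that idea only; the `(t, x, y, polarity)` array layout and the `events_to_frame` helper are assumptions for illustration, not the dataset's actual format or tooling.

```python
import numpy as np

def events_to_frame(events, height, width, t_start, t_end):
    """Accumulate asynchronous brightness-change events (t, x, y, polarity)
    within [t_start, t_end) into a signed 2D histogram.

    `events` is assumed to be a float array of shape (N, 4); the dataset's
    on-disk format may differ.
    """
    frame = np.zeros((height, width), dtype=np.float32)
    t, x, y, p = events[:, 0], events[:, 1], events[:, 2], events[:, 3]
    mask = (t >= t_start) & (t < t_end)
    xs = x[mask].astype(int)
    ys = y[mask].astype(int)
    ps = np.where(p[mask] > 0, 1.0, -1.0)          # ON events +1, OFF events -1
    np.add.at(frame, (ys, xs), ps)                 # scatter-add per pixel
    return frame

if __name__ == "__main__":
    # Synthetic demo events: timestamps [s], x, y, polarity.
    rng = np.random.default_rng(0)
    demo = np.column_stack([rng.uniform(0.0, 0.01, 1000),
                            rng.integers(0, 320, 1000),
                            rng.integers(0, 240, 1000),
                            rng.integers(0, 2, 1000)])
    print(events_to_frame(demo, 240, 320, 0.0, 0.01).shape)  # (240, 320)
```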
📝 Abstract
Agile locomotion in legged robots poses significant challenges for visual perception. Traditional frame-based cameras often fail in these scenarios, producing blurred images, particularly under low-light conditions. In contrast, event cameras capture changes in brightness asynchronously, offering low latency, high temporal resolution, and high dynamic range. These advantages make them well suited for robust perception during rapid motion and under challenging illumination. However, existing event camera datasets offer limited coverage of stereo configurations and multi-band sensing under varied illumination conditions. To address this gap, we present M-SEVIQ, a multi-band stereo event visual and inertial quadruped dataset collected with a Unitree Go2 equipped with stereo event cameras, a frame-based camera, an inertial measurement unit (IMU), and joint encoders. The dataset contains more than 30 real-world sequences captured across different velocity levels, illumination wavelengths, and lighting conditions. In addition, comprehensive calibration data, including intrinsic and extrinsic parameters and temporal alignment, are provided to facilitate accurate sensor fusion and benchmarking. M-SEVIQ supports research in agile robot perception, sensor fusion, semantic segmentation, and multi-modal vision in challenging environments.
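As a rough illustration of how the provided calibration might be consumed, the sketch below shifts event timestamps into a common (IMU) clock and projects a 3D point from an event camera into the frame camera via an extrinsic transform. All numeric values, the names `T_frame_from_event` and `t_offset_event_to_imu`, and the overall API are hypothetical assumptions; the dataset's actual calibration files and conventions may differ.

```python
import numpy as np

# Hypothetical calibration values; the real dataset ships calibration data
# whose exact file format and parameter names are not shown here.
K_frame = np.array([[600.0, 0.0, 320.0],
                    [0.0, 600.0, 240.0],
                    [0.0, 0.0, 1.0]])              # frame-camera intrinsics
T_frame_from_event = np.eye(4)                     # extrinsic: event cam -> frame cam
T_frame_from_event[:3, 3] = [0.05, 0.0, 0.0]       # e.g. 5 cm baseline (assumed)
t_offset_event_to_imu = 0.0012                     # per-sensor clock offset [s] (assumed)

def align_event_timestamps(t_event, offset=t_offset_event_to_imu):
    """Shift event timestamps into the IMU clock using the calibrated offset."""
    return t_event + offset

def project_event_point(p_event_cam):
    """Transform a 3D point from the event-camera frame into the frame-camera
    frame and project it with that camera's intrinsics (distortion ignored)."""
    p_h = np.append(p_event_cam, 1.0)              # homogeneous coordinates
    p_frame = (T_frame_from_event @ p_h)[:3]
    uv = K_frame @ p_frame
    return uv[:2] / uv[2]

if __name__ == "__main__":
    ts = np.array([0.100, 0.101, 0.102])
    print(align_event_timestamps(ts))                        # timestamps in IMU clock
    print(project_event_point(np.array([0.1, 0.0, 2.0])))    # pixel in frame camera
```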