🤖 AI Summary
Existing VIO/SLAM datasets lack systematic coverage of key challenges inherent to real-world head-mounted device usage, such as high-frequency head motion, dynamic occlusions, low-texture environments, abrupt illumination changes, sensor saturation, and long-term drift, which hinders robustness improvements. To address this, we introduce the first large-scale, open-source dataset specifically designed for head-mounted visual-inertial tracking. It comprises synchronized RGB images and IMU measurements from multiple VR headsets operating in diverse, complex environments, featuring precise spatiotemporal alignment, comprehensive modeling of realistic disturbances, and continuous sequences lasting several minutes. Crucially, it is the first dataset to systematically integrate high-dynamic motion, dynamic occlusions, and sensor saturation, filling a critical gap in long-duration, high-interference VIO/SLAM evaluation benchmarks. Released under the CC BY 4.0 license, it aims to advance the generalization capability and real-world deployment reliability of localization and mapping algorithms in embodied AI and mixed reality.
📝 Abstract
Humanoid robots and mixed reality headsets benefit from the use of head-mounted sensors for tracking. While advancements in visual-inertial odometry (VIO) and simultaneous localization and mapping (SLAM) have produced new, high-quality state-of-the-art tracking systems, we show that these are still unable to gracefully handle many of the challenging settings presented in head-mounted use cases. Common scenarios such as high-intensity motions, dynamic occlusions, long tracking sessions, low-textured areas, adverse lighting conditions, and sensor saturation, to name a few, continue to be poorly covered by existing datasets in the literature. As a result, systems may inadvertently overlook these essential real-world issues. To address this, we present the Monado SLAM dataset, a set of real sequences taken from multiple virtual reality headsets. We release the dataset under a permissive CC BY 4.0 license to drive advancements in VIO/SLAM research and development.