EnvoDat: A Large-Scale Multisensory Dataset for Robotic Spatial Awareness and Semantic Reasoning in Heterogeneous Environments

📅 2024-10-29
🏛️ arXiv.org
🤖 AI Summary
Evaluating robot autonomy is hindered by the scarcity of high-quality, heterogeneous datasets covering challenging environments, such as underground tunnels, natural fields, and modern indoor spaces, that are characterized by multiple degradations, low texture, and high dynamics. To address this, the paper introduces EnvoDat, a large-scale, multimodal robotic perception dataset designed for such demanding operational scenarios. It covers 13 scenes recorded under degraded conditions including zero visibility, dense fog, heavy rain, and illumination changes at different times of the day, with synchronized, spatiotemporally aligned acquisition across ten sensing modalities, including LiDAR, RGB, event camera, IMU, depth, and thermal. In total, the dataset comprises 26 sequences, over 1.9 TB of raw data, more than 89K fine-grained polygon annotations, and semantic labels for more than 82 object and terrain classes. The authors additionally provide SLAM- and supervised-learning-ready formats alongside a multi-scale annotation toolkit, enabling benchmarks of localization robustness and semantic understanding under low-light, feature-poor, and highly dynamic conditions.

📝 Abstract
To ensure the efficiency of robot autonomy under diverse real-world conditions, a high-quality heterogeneous dataset is essential to benchmark the operating algorithms' performance and robustness. Current benchmarks predominantly focus on urban terrains, specifically for on-road autonomous driving, leaving multi-degraded, densely vegetated, dynamic, and feature-sparse environments, such as underground tunnels, natural fields, and modern indoor spaces, underrepresented. To fill this gap, we introduce EnvoDat, a large-scale, multi-modal dataset collected in diverse environments and conditions, including high illumination, fog, rain, and zero visibility at different times of the day. Overall, EnvoDat contains 26 sequences from 13 scenes, 10 sensing modalities, over 1.9 TB of data, and over 89K fine-grained polygon-based annotations for more than 82 object and terrain classes. We post-processed EnvoDat in different formats that support benchmarking SLAM and supervised learning algorithms, and fine-tuning multimodal vision models. With EnvoDat, we contribute to environment-resilient robotic autonomy in areas where the conditions are extremely challenging. The datasets and other relevant resources can be accessed through https://linusnep.github.io/EnvoDat/.
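The abstract mentions 89K fine-grained polygon-based annotations. As a rough illustration, the sketch below shows what one polygon label record might look like and computes its pixel area with the shoelace formula. The field names and COCO-like layout are assumptions for illustration, not EnvoDat's actual schema.

```python
# Hypothetical polygon annotation record in a COCO-like layout.
# EnvoDat's real schema may differ -- this only illustrates the idea of
# fine-grained polygon labels over object and terrain classes.
record = {
    "image_id": 17,
    "category": "tunnel_wall",  # one of the 82+ object/terrain classes
    "polygon": [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)],  # (x, y) px
}

def polygon_area(points):
    """Shoelace formula: unsigned area of a simple polygon."""
    n = len(points)
    s = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

area = polygon_area(record["polygon"])
print(f"{record['category']}: area = {area:.1f} px^2")  # 4x3 rectangle -> 12.0
```

Per-class area statistics like this are a common first sanity check when fine-tuning segmentation models on a new annotation set.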
Problem

Research questions and friction points this paper is trying to address.

Robotic spatial awareness in heterogeneous environments
Semantic reasoning for diverse real-world conditions
Benchmarking SLAM and supervised learning algorithms
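SLAM benchmarking on datasets like this typically reports the absolute trajectory error (ATE) between an estimated trajectory and ground truth. The minimal sketch below computes a translational ATE RMSE over two toy trajectories; real evaluations would first associate timestamps and spatially align the estimate to the ground truth, which is omitted here.

```python
import math

def ate_rmse(ground_truth, estimate):
    """RMSE of per-pose translational error between two aligned 3D trajectories."""
    assert len(ground_truth) == len(estimate), "trajectories must be associated first"
    sq = 0.0
    for (gx, gy, gz), (ex, ey, ez) in zip(ground_truth, estimate):
        sq += (gx - ex) ** 2 + (gy - ey) ** 2 + (gz - ez) ** 2
    return math.sqrt(sq / len(ground_truth))

# Toy trajectories: the estimate drifts 0.1 m laterally at every pose.
gt  = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
est = [(0.0, 0.1, 0.0), (1.0, -0.1, 0.0), (2.0, 0.1, 0.0)]
print(f"ATE RMSE: {ate_rmse(gt, est):.3f} m")  # 0.100 m for this toy pair
```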
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale multisensory dataset
Diverse environments and conditions
Supports SLAM and supervised learning
Authors

Linus Nwankwo (Chair of Cyber-Physical Systems, Montanuniversitaet Leoben, Austria)
Bjoern Ellensohn (Chair of Cyber-Physical Systems, Montanuniversitaet Leoben, Austria)
Vedant Dave (Chair of Cyber-Physical Systems, Montanuniversitaet Leoben, Austria)
Peter Hofer (Oracle Labs)
Jan Forstner (Chair of Subsurface Engineering, Montanuniversitaet Leoben, Austria)
Marlene Villeneuve (Chair of Subsurface Engineering, Montanuniversitaet Leoben, Austria)
Robert Galler (Chair of Subsurface Engineering, Montanuniversitaet Leoben, Austria)
Elmar Rueckert (Chair of Cyber-Physical Systems, Montanuniversitaet Leoben, Austria)