EnvoDat: A Large-Scale Multisensory Dataset for Robotic Spatial Awareness and Semantic Reasoning in Heterogeneous Environments

📅 2024-10-29
🏛️ arXiv.org
🤖 AI Summary
Evaluating robot autonomy is hindered by the scarcity of high-quality, heterogeneous datasets covering challenging environments, such as underground tunnels, natural fields, and modern indoor spaces, that are characterized by multiple degradations, low texture, and high dynamics. To address this, the paper introduces EnvoDat, a large-scale, multimodal robotic perception dataset designed for such demanding operational scenarios. It covers 13 scenes recorded under degraded conditions including zero visibility, dense fog, heavy rain, and illumination changes at different times of the day, with synchronized, spatiotemporally aligned acquisition across ten sensing modalities, including LiDAR, RGB, event camera, IMU, depth, and thermal. In total, the dataset comprises 26 sequences, over 1.9 TB of raw data, more than 89K fine-grained polygon annotations, and semantic labels for more than 82 object and terrain classes. The authors additionally provide SLAM- and supervised-learning-ready formats alongside a multi-scale annotation toolkit, enabling benchmarks of localization robustness and semantic understanding under low-light, feature-poor, and highly dynamic conditions.

📝 Abstract
To ensure the efficiency of robot autonomy under diverse real-world conditions, a high-quality heterogeneous dataset is essential to benchmark the operating algorithms' performance and robustness. Current benchmarks predominantly focus on urban terrains, specifically for on-road autonomous driving, leaving multi-degraded, densely vegetated, dynamic, and feature-sparse environments, such as underground tunnels, natural fields, and modern indoor spaces, underrepresented. To fill this gap, we introduce EnvoDat, a large-scale, multi-modal dataset collected in diverse environments and conditions, including high illumination, fog, rain, and zero visibility at different times of the day. Overall, EnvoDat contains 26 sequences from 13 scenes, 10 sensing modalities, over 1.9 TB of data, and over 89K fine-grained polygon-based annotations for more than 82 object and terrain classes. We post-processed EnvoDat in different formats that support benchmarking SLAM and supervised learning algorithms, and fine-tuning multimodal vision models. With EnvoDat, we contribute to environment-resilient robotic autonomy in areas where the conditions are extremely challenging. The datasets and other relevant resources can be accessed through https://linusnep.github.io/EnvoDat/.
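The abstract mentions 89K fine-grained polygon-based annotations. As a rough illustration, the sketch below shows what one polygon label record might look like and computes its pixel area with the shoelace formula. The field names and COCO-like layout are assumptions for illustration, not EnvoDat's actual schema.

```python
# Hypothetical polygon annotation record in a COCO-like layout.
# EnvoDat's real schema may differ -- this only illustrates the idea of
# fine-grained polygon labels over object and terrain classes.
record = {
    "image_id": 17,
    "category": "tunnel_wall",  # one of the 82+ object/terrain classes
    "polygon": [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)],  # (x, y) px
}

def polygon_area(points):
    """Shoelace formula: unsigned area of a simple polygon."""
    n = len(points)
    s = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

area = polygon_area(record["polygon"])
print(f"{record['category']}: area = {area:.1f} px^2")  # 4x3 rectangle -> 12.0
```

Per-class area statistics like this are a common first sanity check when fine-tuning segmentation models on a new annotation set.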
Problem

Research questions and friction points this paper is trying to address.

Robotic spatial awareness in heterogeneous environments
Semantic reasoning for diverse real-world conditions
Benchmarking SLAM and supervised learning algorithms
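SLAM benchmarking on datasets like this typically reports the absolute trajectory error (ATE) between an estimated trajectory and ground truth. The minimal sketch below computes a translational ATE RMSE over two toy trajectories; real evaluations would first associate timestamps and spatially align the estimate to the ground truth, which is omitted here.

```python
import math

def ate_rmse(ground_truth, estimate):
    """RMSE of per-pose translational error between two aligned 3D trajectories."""
    assert len(ground_truth) == len(estimate), "trajectories must be associated first"
    sq = 0.0
    for (gx, gy, gz), (ex, ey, ez) in zip(ground_truth, estimate):
        sq += (gx - ex) ** 2 + (gy - ey) ** 2 + (gz - ez) ** 2
    return math.sqrt(sq / len(ground_truth))

# Toy trajectories: the estimate drifts 0.1 m laterally at every pose.
gt  = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
est = [(0.0, 0.1, 0.0), (1.0, -0.1, 0.0), (2.0, 0.1, 0.0)]
print(f"ATE RMSE: {ate_rmse(gt, est):.3f} m")  # 0.100 m for this toy pair
```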
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale multisensory dataset
Diverse environments and conditions
Supports SLAM and supervised learning
Authors

Linus Nwankwo (Chair of Cyber-Physical Systems, Montanuniversitaet Leoben, Austria)
Bjoern Ellensohn (Chair of Cyber-Physical Systems, Montanuniversitaet Leoben, Austria)
Vedant Dave (Chair of Cyber-Physical Systems, Montanuniversitaet Leoben, Austria)
Peter Hofer (Oracle Labs)
Jan Forstner (Chair of Subsurface Engineering, Montanuniversitaet Leoben, Austria)
Marlene Villeneuve (Chair of Subsurface Engineering, Montanuniversitaet Leoben, Austria)
Robert Galler (Chair of Subsurface Engineering, Montanuniversitaet Leoben, Austria)
Elmar Rueckert (Chair of Cyber-Physical Systems, Montanuniversitaet Leoben, Austria)