InternScenes: A Large-scale Simulatable Indoor Scene Dataset with Realistic Layouts

📅 2025-09-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing indoor 3D scene datasets tend to be limited in scale or diversity, use sanitized layouts with few small objects, and contain severe inter-object collisions. To address these limitations, this work introduces InternScenes, a large-scale, simulation-ready dataset of approximately 40,000 diverse scenes spanning 15 scene types and 288 object classes, built from three complementary sources: real-world scans, procedurally generated scenes, and designer-created scenes. The data processing pipeline preserves small objects, creates real-to-sim replicas of the real-world scans, adds interactive objects, and resolves object collisions through physical simulation. Benchmarks on scene layout generation and point-goal navigation show that the complex, realistic layouts pose new challenges, and that the dataset enables scaling up model training for both tasks. The dataset is intended as a foundational resource for large-scale training of embodied AI systems.

📝 Abstract
The advancement of Embodied AI heavily relies on large-scale, simulatable 3D scene datasets characterized by scene diversity and realistic layouts. However, existing datasets typically suffer from limitations in data scale or diversity, sanitized layouts lacking small items, and severe object collisions. To address these shortcomings, we introduce InternScenes, a novel large-scale simulatable indoor scene dataset comprising approximately 40,000 diverse scenes. It integrates three disparate scene sources (real-world scans, procedurally generated scenes, and designer-created scenes), includes 1.96M 3D objects, and covers 15 common scene types and 288 object classes. We deliberately preserve the large number of small items in the scenes, resulting in realistic and complex layouts with an average of 41.5 objects per region. Our comprehensive data processing pipeline ensures simulatability by creating real-to-sim replicas for real-world scans, enhances interactivity by incorporating interactive objects into these scenes, and resolves object collisions through physical simulation. We demonstrate the value of InternScenes with two benchmark applications: scene layout generation and point-goal navigation. Both reveal new challenges posed by the complex and realistic layouts. More importantly, InternScenes paves the way for scaling up model training for both tasks, making generation and navigation in such complex scenes possible. We commit to open-sourcing the data, models, and benchmarks to benefit the whole community.
Problem

Research questions and friction points this paper is trying to address.

Creating large-scale simulatable indoor scenes with realistic layouts
Addressing limitations in scene diversity and object collisions
Integrating multiple scene sources for enhanced complexity and interactivity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates three disparate scene sources
Ensures simulatability via real-to-sim replicas
Resolves object collisions through physical simulations
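
To make the collision issue concrete, here is a minimal sketch of how overlapping objects in a layout could be flagged before any resolution step. It is a simplified stand-in that assumes each object is summarized by an axis-aligned bounding box; it is not the paper's physics-based pipeline, and the names `Box`, `penetration`, and `find_collisions` are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Box:
    """Axis-aligned bounding box given by its min and max corners (assumed representation)."""
    name: str
    lo: Vec3
    hi: Vec3


def penetration(a: Box, b: Box) -> Optional[Vec3]:
    """Return the overlap depth along each axis, or None if the boxes do not intersect."""
    depths = []
    for i in range(3):
        d = min(a.hi[i], b.hi[i]) - max(a.lo[i], b.lo[i])
        if d <= 0:  # separated (or merely touching) on this axis
            return None
        depths.append(d)
    return (depths[0], depths[1], depths[2])


def find_collisions(boxes: List[Box]) -> List[Tuple[str, str]]:
    """List every pair of objects whose bounding boxes interpenetrate."""
    return [
        (a.name, b.name)
        for a, b in combinations(boxes, 2)
        if penetration(a, b) is not None
    ]


# Hypothetical toy layout: the chair overlaps the table, the lamp stands apart.
layout = [
    Box("table", (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)),
    Box("chair", (0.5, 0.5, 0.0), (1.5, 1.5, 1.0)),
    Box("lamp", (3.0, 3.0, 0.0), (3.5, 3.5, 1.0)),
]
print(find_collisions(layout))
```

A bounding-box check like this only detects candidate overlaps; separating the objects while keeping the layout plausible is the part the source attributes to physical simulation.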
Weipeng Zhong
Shanghai Artificial Intelligence Laboratory, Shanghai Jiao Tong University
Peizhou Cao
Shanghai Artificial Intelligence Laboratory, Beihang University
Yichen Jin
Shanghai Artificial Intelligence Laboratory
Li Luo
Shanghai Artificial Intelligence Laboratory
Wenzhe Cai
Shanghai AI Laboratory
Reinforcement Learning, Visual Navigation, Robotics
Jingli Lin
Shanghai Artificial Intelligence Laboratory, Shanghai Jiao Tong University
Hanqing Wang
Shanghai Artificial Intelligence Laboratory
Zhaoyang Lyu
PhD of Information Engineering, The Chinese University of Hong Kong
machine learning
Tai Wang
Shanghai AI Laboratory
Computer Vision, 3D Vision, Embodied AI, Deep Learning
Bo Dai
The University of Hong Kong
Xudong Xu
Shanghai Artificial Intelligence Laboratory
Jiangmiao Pang
Shanghai Artificial Intelligence Laboratory