AeroScene: Progressive Scene Synthesis for Aerial Robotics

📅 2026-03-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of existing drone simulation frameworks, which rely on manually crafted 3D environments that are difficult to scale and often lack physical plausibility and semantic consistency. To overcome these challenges, we propose AeroScene, a hierarchical diffusion model tailored for aerial robotics tasks. By integrating hierarchy-aware tokenization with multi-branch feature extraction, our approach jointly models global scene layout and local geometric details, enabling progressive 3D scene synthesis. The generated environments are coupled with a physics engine so that they are physically valid and directly usable for downstream tasks such as navigation and landing. Experiments on both a newly curated dataset and a public benchmark show that our method substantially outperforms existing approaches. We also use it to generate over 1,000 high-fidelity, physics-ready 3D scenes and illustrate their utility for downstream drone navigation.

📝 Abstract
Although generative models have shown substantial impact across multiple domains, their potential for scene synthesis remains underexplored in robotics. This gap is especially evident in drone simulators, where environments are still largely built by hand, a process that is time-consuming and difficult to scale. In this work, we introduce AeroScene, a hierarchical diffusion model for progressive 3D scene synthesis. Our approach leverages hierarchy-aware tokenization and multi-branch feature extraction to reason across both global layouts and local details, ensuring physical plausibility and semantic consistency. This makes AeroScene particularly suited for generating realistic scenes for aerial robotics tasks such as navigation, landing, and perching. We demonstrate its effectiveness through extensive experiments on our newly collected dataset and a public benchmark, showing that AeroScene significantly outperforms prior methods. Furthermore, we use AeroScene to generate a large-scale dataset of over 1,000 physics-ready, high-fidelity 3D scenes that can be directly integrated into NVIDIA Isaac Sim. Finally, we illustrate the utility of these generated environments on downstream drone navigation tasks. Our code and dataset are publicly available at aioz-ai.github.io/AeroScene/
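
To make the idea of hierarchy-aware tokenization more concrete, here is a minimal illustrative sketch. It is not the AeroScene implementation, whose details this page does not give. It assumes a scene can be described as a tree (scene, regions, objects) and flattens that tree into a token sequence in which every token keeps its depth and a link to its parent, so a model could in principle attend to global layout and local detail separately. The names SceneNode, Token, and tokenize, and the example labels, are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SceneNode:
    """Hypothetical scene-tree node: a label plus child nodes."""
    label: str
    children: list["SceneNode"] = field(default_factory=list)


@dataclass(frozen=True)
class Token:
    index: int   # position in the flattened sequence
    label: str   # what the token describes
    level: int   # depth in the hierarchy (0 = whole scene)
    parent: int  # index of the parent token, -1 for the root


def tokenize(root: SceneNode) -> list[Token]:
    """Flatten a scene tree depth-first, keeping level and parent links."""
    tokens: list[Token] = []

    def visit(node: SceneNode, level: int, parent: int) -> None:
        index = len(tokens)
        tokens.append(Token(index, node.label, level, parent))
        for child in node.children:
            visit(child, level + 1, index)

    visit(root, 0, -1)
    return tokens


# Assumed toy scene: one global node, two regions, three objects.
scene = SceneNode("scene", [
    SceneNode("landing_zone", [SceneNode("helipad"), SceneNode("windsock")]),
    SceneNode("building", [SceneNode("rooftop_antenna")]),
])
tokens = tokenize(scene)
for t in tokens:
    print(t)
```

Under these assumptions, tokens at level 0 and 1 carry the coarse layout, while level-2 tokens carry local objects; the parent index is what lets a later stage refine each region conditioned on the layout above it.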
Problem

Research questions and friction points this paper is trying to address.

scene synthesis
aerial robotics
drone simulation
3D environment generation
generative models
Innovation

Methods, ideas, or system contributions that make the work stand out.

hierarchical diffusion model
scene synthesis
aerial robotics
hierarchy-aware tokenization
physics-ready 3D scenes
Nghia Vu
AIOZ Ltd., Singapore
Tuong Do
University of Liverpool, UK; AIOZ Ltd., Singapore; National Tsing Hua University, Taiwan
Dzung Tran
RMIT University, Vietnam Campus
Binh X. Nguyen
AI Researcher at AIOZ
Computer Science, Computer Vision, Machine Learning
Hoan Nguyen
University of Information Technology, VNUHCM, Vietnam
Erman Tjiputra
AIOZ Ltd., Singapore
Quang D. Tran
Research Scientist, University of Liverpool
Machine Learning, Computer Vision, Robotics, Federated Learning, Data Science
Hai-Nguyen Nguyen
RMIT University, Vietnam Campus
Anh Nguyen
University of Liverpool
Robotic Vision, Machine Learning, Robotics