🤖 AI Summary
This work addresses the challenge of semantic object navigation for aerial robots in unknown indoor environments without external localization or global maps. To this end, we propose AION, an end-to-end dual-policy reinforcement learning framework that, for the first time, decouples exploration and goal-reaching behaviors into two specialized policies, enabling efficient and safe 3D navigation using only visual inputs. The method is evaluated in high-fidelity IsaacSim simulations and on the AI2-THOR benchmark, demonstrating significant improvements over existing approaches. AION achieves state-of-the-art performance across key metrics, including exploration coverage, navigation efficiency, and flight safety, highlighting its effectiveness in complex, unstructured indoor settings.
📝 Abstract
Object-Goal Navigation (ObjectNav) requires an agent to autonomously explore an unknown environment and navigate toward target objects specified by a semantic label. While prior work has primarily studied zero-shot ObjectNav under 2D locomotion, extending it to aerial platforms with 3D locomotion capability remains underexplored. Aerial robots offer superior maneuverability and search efficiency, but they also introduce new challenges in spatial perception, dynamic control, and safety assurance. In this paper, we propose AION for vision-based aerial ObjectNav without relying on external localization or global maps. AION is an end-to-end dual-policy reinforcement learning (RL) framework that decouples exploration and goal-reaching behaviors into two specialized policies. We evaluate AION on the AI2-THOR benchmark and further assess its real-time performance in IsaacSim using high-fidelity drone models. Experimental results show that AION achieves superior performance across comprehensive evaluation metrics in exploration, navigation efficiency, and safety. A demonstration video is available at https://youtu.be/TgsUm6bb7zg.