HALO: High-Altitude Language-Conditioned Monocular Aerial Exploration and Navigation

📅 2025-11-21
🤖 AI Summary
To address real-time dense metric-semantic mapping and autonomous exploration of large-scale outdoor environments from high altitude with a monocular camera, this paper proposes HALO, a tightly coupled multimodal framework. It integrates visual-inertial SLAM, monocular depth estimation, real-time semantic segmentation, and natural language understanding to perform semantic mapping and language-conditioned trajectory planning onboard the robot. The approach targets three bottlenecks: low geometric reconstruction accuracy at long range, weak semantic perception, and difficulty planning for missions with multiple tasks. In simulated environments of up to 78,000 m², HALO completes tasks with less exploration time and achieves up to a 68% higher competitive ratio in distance traveled than the state-of-the-art semantic exploration baseline. Real-world experiments on a custom quadrotor demonstrate autonomous missions covering up to 24,600 m² at an altitude of 40 m, with all modules running onboard.

📝 Abstract
We demonstrate real-time high-altitude aerial metric-semantic mapping and exploration using a monocular camera paired with a global positioning system (GPS) and an inertial measurement unit (IMU). Our system, named HALO, addresses two key challenges: (i) real-time dense 3D reconstruction using vision at large distances, and (ii) mapping and exploration of large-scale outdoor environments with accurate scene geometry and semantics. We demonstrate that HALO can plan informative paths that exploit this information to complete missions with multiple tasks specified in natural language. In simulation-based evaluation across large-scale environments of size up to 78,000 sq. m., HALO consistently completes tasks with less exploration time and achieves up to 68% higher competitive ratio in terms of the distance traveled compared to the state-of-the-art semantic exploration baseline. We use real-world experiments on a custom quadrotor platform to demonstrate that (i) all modules can run onboard the robot, and that (ii) in diverse environments HALO can support effective autonomous execution of missions covering up to 24,600 sq. m. area at an altitude of 40 m. Experiment videos and more details can be found on our project page: https://tyuezhan.github.io/halo/.
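
The "competitive ratio in terms of the distance traveled" can be made concrete with a small sketch. The abstract does not define the metric, so the convention below is an assumption: since a higher ratio is reported as better, the ratio is taken as the shortest possible path length divided by the distance actually flown, so that 1.0 is optimal. The function names and all numbers are hypothetical and are not results from the paper.

```python
def competitive_ratio(optimal_distance_m: float, traveled_distance_m: float) -> float:
    """Assumed convention: optimal path length over distance actually flown.

    A value of 1.0 means the robot flew the shortest possible path;
    smaller values mean more wasted travel.
    """
    if optimal_distance_m < 0 or traveled_distance_m <= 0:
        raise ValueError("distances must be non-negative, traveled distance positive")
    return optimal_distance_m / traveled_distance_m


def relative_improvement(ratio_method: float, ratio_baseline: float) -> float:
    """Fractional gain of one competitive ratio over another."""
    if ratio_baseline <= 0:
        raise ValueError("baseline ratio must be positive")
    return (ratio_method - ratio_baseline) / ratio_baseline


# Hypothetical mission: the shortest tour that completes all tasks is 400 m.
baseline_ratio = competitive_ratio(400.0, 1000.0)  # baseline flies 1000 m
method_ratio = competitive_ratio(400.0, 625.0)     # method flies 625 m
improvement = relative_improvement(method_ratio, baseline_ratio)
print(f"baseline={baseline_ratio:.2f} method={method_ratio:.2f} gain={improvement:.0%}")
```

Under this reading, "up to 68% higher" is a relative gain in the ratio, not a 68% reduction in distance flown.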

Problem

Research questions and friction points this paper is trying to address.

Real-time dense 3D reconstruction using vision at large distances
Mapping and exploration of large-scale outdoor environments with accurate scene geometry and semantics
Planning informative paths to complete missions with multiple tasks specified in natural language

Innovation

Methods, ideas, or system contributions that make the work stand out.

Monocular camera with GPS and IMU integration
Real-time dense 3D reconstruction at large distances
Natural-language-conditioned path planning for multi-task missions
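
One generic reason to pair a monocular camera with GPS is that monocular motion estimates are only known up to scale, while GPS positions are metric. The sketch below is not HALO's method, which this page does not describe. It shows a standard least-squares scale fit between an up-to-scale trajectory and time-matched GPS positions, assuming both are already expressed in the same axis orientation. The function name and the sample data are hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def estimate_metric_scale(vo_positions: Sequence[Vec3],
                          gps_positions: Sequence[Vec3]) -> float:
    """Least-squares scale s minimizing sum ||d_gps - s * d_vo||^2.

    d_vo and d_gps are displacements between consecutive, time-matched
    samples. Assumes both trajectories share the same axis orientation.
    """
    if len(vo_positions) != len(gps_positions) or len(vo_positions) < 2:
        raise ValueError("need at least two time-matched samples per trajectory")
    numerator = 0.0
    denominator = 0.0
    for i in range(1, len(vo_positions)):
        d_vo = [b - a for a, b in zip(vo_positions[i - 1], vo_positions[i])]
        d_gps = [b - a for a, b in zip(gps_positions[i - 1], gps_positions[i])]
        numerator += sum(x * y for x, y in zip(d_vo, d_gps))
        denominator += sum(x * x for x in d_vo)
    if denominator == 0.0:
        raise ValueError("visual trajectory has no motion; scale is unobservable")
    return numerator / denominator


# Hypothetical data: the GPS track is the visual track scaled by 5 and shifted.
vo_track = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
gps_track = [(10.0, 20.0, 40.0), (15.0, 20.0, 40.0), (15.0, 30.0, 40.0)]
scale = estimate_metric_scale(vo_track, gps_track)
print(f"estimated scale: {scale:.2f}")
```

Using displacements rather than absolute positions makes the fit independent of any constant offset between the two frames.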