VisLanding: Monocular 3D Perception for UAV Safe Landing via Depth-Normal Synergy

📅 2025-06-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of safe, autonomous UAV landing in complex and unknown environments, this paper proposes a monocular 3D perception framework that formulates Safe Landing Zone (SLZ) estimation as an end-to-end binary semantic segmentation task. Methodologically, it builds on the joint depth and surface-normal prediction of a fine-tuned Metric3D V2 backbone and adds a dedicated SLZ segmentation head. The model is fine-tuned on the WildUAV dataset with drone-view annotations, and a separate cross-domain evaluation dataset is constructed to test robustness. Key contributions include: (1) the first monocular-image-based framework enabling joint quantitative estimation of SLZ location and area; (2) zero-shot cross-domain generalization capability, improving robustness and segmentation accuracy over compared approaches; and (3) real-time performance and operational reliability validated on an actual UAV decision-making system.

📝 Abstract
This paper presents VisLanding, a monocular 3D perception-based framework for safe UAV (Unmanned Aerial Vehicle) landing. Addressing the core challenge of autonomous UAV landing in complex and unknown environments, this study leverages the depth-normal synergy prediction capabilities of the Metric3D V2 model to construct an end-to-end safe landing zone (SLZ) estimation framework. By introducing a safe zone segmentation branch, we transform the landing zone estimation task into a binary semantic segmentation problem. The model is fine-tuned on the WildUAV dataset, which is annotated from a UAV perspective, and a cross-domain evaluation dataset is constructed to validate the model's robustness. Experimental results demonstrate that VisLanding significantly enhances the accuracy of safe zone identification through a depth-normal joint optimization mechanism, while retaining the zero-shot generalization advantages of Metric3D V2. The proposed method exhibits superior generalization and robustness in cross-domain testing compared to other approaches. Furthermore, it enables the estimation of landing zone area by integrating predicted depth and normal information, providing critical decision-making support for practical applications.
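The abstract states that landing zone area is estimated by combining predicted depth and normals, but it does not give the formula. The following is a minimal sketch of one way this could be done under a pinhole camera model, not the authors' implementation. It assumes per-pixel z-depth in meters, unit normals in the camera frame, and a binary safe-zone mask; the function name `slz_area_m2`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the clamp `min_dot` are all hypothetical. Each safe pixel contributes a surface footprint of Z^2 / (fx * fy * |n . r|), where r = (x, y, 1) is the unnormalized viewing ray through that pixel.

```python
def slz_area_m2(depth, normals, mask, fx, fy, cx, cy, min_dot=1e-3):
    """Sum the metric surface area of all pixels marked safe in `mask`.

    depth[v][u]   : z-depth in meters along the optical axis (assumed input)
    normals[v][u] : (nx, ny, nz) unit surface normal in the camera frame
    mask[v][u]    : truthy where the pixel is predicted as safe
    fx, fy, cx, cy: pinhole intrinsics in pixels (assumed known)
    min_dot       : hypothetical clamp that limits grazing-angle blow-up
    """
    total = 0.0
    for v, row in enumerate(mask):
        for u, safe in enumerate(row):
            if not safe:
                continue
            z = depth[v][u]
            nx, ny, nz = normals[v][u]
            # Unnormalized viewing ray through the pixel: r = (x, y, 1).
            x = (u - cx) / fx
            y = (v - cy) / fy
            # |n . r| shrinks as the surface tilts away from the camera.
            n_dot_r = max(abs(nx * x + ny * y + nz), min_dot)
            # Footprint of one pixel on the surface, in square meters.
            total += z * z / (fx * fy * n_dot_r)
    return total
```

For a flat surface facing the camera at 10 m with fx = fy = 100 px, each pixel covers 0.01 m^2, so the estimate scales with the square of the altitude and grows further as the surface tilts relative to the viewing ray.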
Problem

Research questions and friction points this paper is trying to address.

Monocular 3D perception for UAV safe landing
Depth-normal synergy for safe zone identification
Cross-domain robustness in complex environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Monocular 3D perception for UAV landing
Depth-normal synergy prediction via Metric3D V2
Safe zone segmentation as binary classification
Zhuoyue Tan
Institute of Artificial Intelligence, Xiamen University
Boyong He
Xiamen University
CV
Yuxiang Ji
Xiamen University
Liaoni Wu
Institute of Artificial Intelligence, Xiamen University, School of Aerospace Engineering, Xiamen University