🤖 AI Summary
Fixed, hand-picked prompts make open-vocabulary segmentation models such as CLIPSeg unreliable when the environment changes, which jeopardizes safe drone landing. Method: We propose PEACE (Prompt Engineering Automation for CLIPSeg Enhancement), a framework that automates prompt engineering for CLIPSeg. It combines environment-aware feedback and prompt optimization search with the DOVESEI architecture to generate and iteratively refine semantic prompts from monocular images, removing the dependence on manual prompt design. Contribution/Results: The combined DOVESEI and PEACE system adapts to shifts in data distribution and improves the success rate of safe landing zone selection by at least 30% in both simulation and indoor experiments. On aerial images, its generated prompts also outperform the standard prompts used for CLIP and CLIPSeg. The work brings dynamic prompt engineering to open-vocabulary segmentation as a route to more robust semantic understanding for embodied agents in open-world settings.
📝 Abstract
Safe landing is an essential aspect of flight operations in fields ranging from industrial to space robotics. With the growing interest in artificial intelligence, we focus on learning-based methods for safe landing. Our previous work, Dynamic Open-Vocabulary Enhanced SafE-Landing with Intelligence (DOVESEI), demonstrated the feasibility of using prompt-based segmentation for identifying safe landing zones with open vocabulary models. However, relying on a heuristic selection of words for prompts is not reliable, as it cannot adapt to changing environments, potentially leading to harmful outcomes if the observed environment is not accurately represented by the chosen prompt. To address this issue, we introduce PEACE (Prompt Engineering Automation for CLIPSeg Enhancement), an enhancement to DOVESEI that automates prompt engineering to adapt to shifts in data distribution. PEACE can perform safe landings using only monocular cameras and image segmentation. PEACE shows significant improvements in prompt generation and engineering for aerial images compared to standard prompts used for CLIP and CLIPSeg. By combining DOVESEI and PEACE, our system improved the success rate of safe landing zone selection by at least 30% in both simulations and indoor experiments.