🤖 AI Summary
Conventional image-based visual servoing (IBVS) for UAVs relies heavily on artificial markers and is not robust to occlusion, illumination changes, or cluttered backgrounds. To address these limitations, this paper proposes a marker-free, deep-learning-driven monocular IBVS framework. Its key innovation is the first integration of unsupervised CNN-based keypoint detection into aerial robotic IBVS, replacing hand-crafted markers with natural scene keypoints. The method combines learned feature embeddings with a robust IBVS controller designed for pose regulation under visual uncertainty. The approach is validated in a photorealistic ROS/Gazebo simulation environment. Experimental results demonstrate high-precision pose regulation under challenging conditions, including textureless regions, dynamic backgrounds, and partial occlusions, while significantly outperforming conventional marker-based IBVS in robustness and generalizability. This work substantially extends the applicability of perception-guided motion control to realistic UAV operational scenarios.
📝 Abstract
This article addresses image-based visual servoing (IBVS) of an aerial robot using deep-learning-based keypoint detection. A monocular RGB camera mounted on the platform collects the visual data, and a convolutional neural network (CNN) extracts the features that serve as the visual input for the servoing task. The approach not only removes the need to detect man-made markers, as required by conventional visual servoing techniques, but also improves robustness against undesirable factors including occlusion, varying illumination, clutter, and background changes, thereby broadening the applicability of perception-guided motion control in aerial robots. In addition, extensive physics-based ROS Gazebo simulations are conducted to assess the effectiveness of the method, in contrast to many existing studies that rely solely on physics-less simulations. A demonstration video is available at https://youtu.be/Dd2Her8Ly-E.
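
To make the servoing step concrete, the following is a minimal sketch of the textbook IBVS control law for point features, v = -λ L⁺ (s - s*), and not the controller developed in the paper. It assumes that the keypoints are already given in normalized image coordinates (in the paper they would come from the CNN detector; here they are hard-coded), that a depth estimate Z is available for each keypoint, and that the output is a 6-DOF camera twist. The function names `interaction_matrix` and `ibvs_velocity` and the gain value are hypothetical and chosen only for illustration.

```python
import numpy as np


def interaction_matrix(points, depths):
    """Stack the classical 2x6 interaction matrix of each point feature.

    points: iterable of (x, y) in normalized image coordinates.
    depths: iterable of depth estimates Z, one per point (an assumption here;
            a monocular camera does not measure depth directly).
    """
    rows = []
    for (x, y), Z in zip(points, depths):
        rows.append([-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y])
        rows.append([0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x])
    return np.array(rows)


def ibvs_velocity(points, desired, depths, gain=0.5):
    """Return the camera twist (vx, vy, vz, wx, wy, wz) = -gain * pinv(L) @ e."""
    # Feature error e = s - s*, flattened to (x1, y1, x2, y2, ...).
    error = (np.asarray(points, dtype=float)
             - np.asarray(desired, dtype=float)).reshape(-1)
    L = interaction_matrix(points, depths)
    # The Moore-Penrose pseudo-inverse handles the non-square (2N x 6) matrix.
    return -gain * np.linalg.pinv(L) @ error


if __name__ == "__main__":
    # Hypothetical keypoints: a square target, currently shifted to the right.
    desired = [(-0.1, -0.1), (0.1, -0.1), (0.1, 0.1), (-0.1, 0.1)]
    current = [(x + 0.05, y) for x, y in desired]
    depths = [1.0] * 4  # assumed depth estimates in metres
    print(ibvs_velocity(current, desired, depths))
```

On a multirotor, the resulting camera twist would still have to be mapped to the vehicle's actuated degrees of freedom, since the platform is underactuated; that mapping is outside the scope of this sketch.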