🤖 AI Summary
To address the challenge of real-time, accurate heading estimation for unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs) operating collaboratively under GPS/GNSS-denied conditions, this paper proposes a purely vision-based, end-to-end data-driven framework. The method integrates a lightweight artificial neural network (ANN) with a fine-tuned YOLOv5 detector to directly regress the relative heading angle of UGVs from monocular images, requiring no external positioning system and relying solely on an onboard monocular camera; this enables dynamic, online multi-agent coordination. Evaluated on a VICON-calibrated dataset, the approach achieves 95% UGV detection accuracy and attains a mean absolute error of 0.1506° and a root-mean-square error of 0.1957° in heading estimation, meeting stringent real-time and stability requirements. This work establishes an efficient, lightweight, and deployable paradigm for vision-only collaborative navigation in GNSS-denied environments.
📝 Abstract
The integration of Unmanned Aerial Vehicles (UAVs) and Unmanned Ground Vehicles (UGVs) is increasingly central to the development of intelligent autonomous systems for applications such as search and rescue, environmental monitoring, and logistics. However, precise coordination between these platforms in real-time scenarios presents major challenges, particularly when external localization infrastructure such as GPS or GNSS is unavailable or degraded [1]. This paper proposes a vision-based, data-driven framework for real-time UAV-UGV integration, with a focus on robust UGV detection and heading angle prediction for navigation and coordination. The system employs a fine-tuned YOLOv5 model to detect UGVs and extract bounding box features, which are then used by a lightweight artificial neural network (ANN) to estimate the UAV's required heading angle. A VICON motion capture system was used to generate ground-truth data during training, resulting in a dataset of over 13,000 annotated images collected in a controlled lab environment. The trained ANN achieves a mean absolute error of 0.1506° and a root-mean-square error of 0.1957°, offering accurate heading angle predictions using only monocular camera inputs. Experimental evaluations show 95% accuracy in UGV detection. This work contributes a vision-based, infrastructure-independent solution that demonstrates strong potential for deployment in GPS/GNSS-denied environments, supporting reliable multi-agent coordination under realistic dynamic conditions. A demonstration video showcasing the system's real-time performance, including UGV detection, heading angle prediction, and UAV alignment under dynamic conditions, is available at: https://github.com/Kooroshraf/UAV-UGV-Integration
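The detection-to-regression pipeline described above can be sketched in a few lines. This is a minimal illustration only: the normalized bounding-box features, the 4-16-1 layer sizes, and the function names (`bbox_features`, `HeadingMLP`) are assumptions for exposition, not the authors' exact architecture, and the weights here are random rather than trained.

```python
import numpy as np

def bbox_features(x1, y1, x2, y2, img_w, img_h):
    """Normalize a detector bounding box to [center_x, center_y, width, height]."""
    cx = (x1 + x2) / 2.0 / img_w
    cy = (y1 + y2) / 2.0 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return np.array([cx, cy, w, h])

class HeadingMLP:
    """Illustrative tiny ANN: 4 inputs -> 16 hidden (ReLU) -> 1 heading angle (degrees)."""
    def __init__(self, rng):
        # Randomly initialized weights; a real system would train these
        # against VICON-derived ground-truth heading angles.
        self.W1 = rng.normal(0.0, 0.1, (4, 16))
        self.b1 = np.zeros(16)
        self.W2 = rng.normal(0.0, 0.1, (16, 1))
        self.b2 = np.zeros(1)

    def predict(self, feats):
        h = np.maximum(0.0, feats @ self.W1 + self.b1)  # ReLU hidden layer
        return float(h @ self.W2 + self.b2)             # scalar heading angle

rng = np.random.default_rng(0)
model = HeadingMLP(rng)
# Hypothetical YOLO detection box in a 640x480 frame.
feats = bbox_features(300, 200, 420, 330, img_w=640, img_h=480)
angle = model.predict(feats)
print(f"predicted heading: {angle:.3f} deg")
```

In deployment, the feature vector would come from each YOLOv5 detection per frame, and the predicted angle would feed the UAV's yaw controller to keep the UGV centered.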