🤖 AI Summary
Achieving high-speed autonomous aerial target tracking and obstacle avoidance in complex outdoor environments without GPS or prior maps remains challenging. This paper proposes a fully onboard, target-centric navigation framework that relies solely on a stereo camera and IMU, unifying perception, state estimation, and control within the target's reference frame and thereby eliminating the need for global localization or mapping. The method integrates lightweight target detection, stereo depth completion, histogram-based filtering, visual-inertial odometry, nonlinear model predictive control, and compact collision-point-set extraction, while generating high-order control barrier functions online to ensure real-time safety-critical obstacle avoidance. Experimental results demonstrate stable target-following flight at speeds exceeding 50 km/h across diverse scenarios, including urban mazes and forest trails, with robustness against severe illumination variations and frequent GPS outages.
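The summary's safety mechanism (high-order control barrier functions generated online from extracted collision points) can be sketched with the standard second-order exponential-HOCBF construction; the symbols and gains below are generic placeholders, not the paper's exact formulation.

```latex
% For each high-risk collision point c_i and robot position p(t),
% define a distance-based barrier with safety margin d_safe:
h_i(x) = \lVert p - c_i \rVert - d_{\mathrm{safe}}
% Under double-integrator (acceleration-controlled) dynamics, h_i has
% relative degree 2, so safety is enforced through the second-order
% (exponential) HOCBF condition on the control input u:
\ddot{h}_i(x, u) + (\alpha_1 + \alpha_2)\,\dot{h}_i(x)
  + \alpha_1 \alpha_2\, h_i(x) \;\ge\; 0
% with gains \alpha_1, \alpha_2 > 0. Each condition is affine in u, so the
% set of constraints (one per collision point) can be imposed directly
% inside the NMPC without any map or dense environment representation.
```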
📝 Abstract
Autonomous aerial target tracking in unstructured and GPS-denied environments remains a fundamental challenge in robotics. Many existing methods rely on motion capture systems, pre-mapped scenes, or feature-based localization to ensure safety and control, limiting their deployment in real-world conditions. We introduce NOVA, a fully onboard, object-centric framework that enables robust target tracking and collision-aware navigation using only a stereo camera and an IMU. Rather than constructing a global map or relying on absolute localization, NOVA formulates perception, estimation, and control entirely in the target's reference frame. A tightly integrated stack combines a lightweight object detector with stereo depth completion, followed by histogram-based filtering to infer robust target distances under occlusion and noise. These measurements feed a visual-inertial state estimator that recovers the full 6-DoF pose of the robot relative to the target. A nonlinear model predictive controller (NMPC) plans dynamically feasible trajectories in the target frame. To ensure safety, high-order control barrier functions are constructed online from a compact set of high-risk collision points extracted from depth, enabling real-time obstacle avoidance without maps or dense representations. We validate NOVA across challenging real-world scenarios, including urban mazes, forest trails, and repeated transitions through buildings with intermittent GPS loss and severe lighting changes that disrupt feature-based localization. Each experiment is repeated multiple times under similar conditions to assess resilience, showing consistent and reliable performance. NOVA achieves agile target following at speeds exceeding 50 km/h. These results show that high-speed vision-based tracking is possible in the wild using only onboard sensing, with no reliance on external localization or environment assumptions.
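The histogram-based distance filtering described in the abstract (inferring a robust target distance from noisy, partially occluded stereo depth inside the detector's bounding box) can be sketched as follows. The function name, depth range, and bin width are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def robust_target_distance(depth_patch, d_min=0.5, d_max=20.0, bin_width=0.25):
    """Estimate target distance from the depth values inside a detection box.

    Pixels vote into a coarse histogram; the dominant bin is taken as the
    target's depth, which suppresses foreground occluders and sensor noise.
    All parameters are illustrative assumptions.
    """
    d = depth_patch[np.isfinite(depth_patch)]          # drop NaN/inf pixels
    d = d[(d >= d_min) & (d <= d_max)]                 # keep plausible depths
    if d.size == 0:
        return None                                    # no valid depth in box
    bins = np.arange(d_min, d_max + bin_width, bin_width)
    counts, edges = np.histogram(d, bins=bins)
    k = np.argmax(counts)                              # dominant depth bin
    in_mode = d[(d >= edges[k]) & (d < edges[k + 1])]  # pixels in that bin
    return float(in_mode.mean())
```

For example, a patch whose pixels mostly lie near 5 m with a smaller cluster from a 2 m occluder resolves to roughly 5 m, since the occluder never wins the histogram vote.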