UAV See, UGV Do: Aerial Imagery and Virtual Teach Enabling Zero-Shot Ground Vehicle Repeat

📅 2025-05-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of zero-shot autonomous navigation for unmanned ground vehicles (UGVs) in unknown, GPS-denied environments, this paper proposes a novel "aerial-observation-to-ground-execution" paradigm. It leverages UAV-captured aerial imagery to reconstruct a neural radiance field (NeRF) 3D scene and generate a high-fidelity teach map. A virtual trajectory is then planned within this NeRF map, and sim-to-real point-cloud registration and closed-loop tracking are achieved by integrating NeRF-derived point-cloud submaps into a LiDAR Teach-and-Repeat (LT&R) framework. This work marks the first direct use of NeRF reconstruction as the navigation map for LT&R, eliminating the need for manual on-site demonstration. In real-world experiments spanning over 12 km of autonomous driving across two environments, the method achieves path-tracking RMSEs of 19.5 cm and 18.4 cm, both below one tire width (24 cm), with maximum errors of 39.4 cm and 47.6 cm, matching the performance of conventional, manually taught LT&R systems.

📝 Abstract
This paper presents Virtual Teach and Repeat (VirT&R): an extension of the Teach and Repeat (T&R) framework that enables GPS-denied, zero-shot autonomous ground vehicle navigation in untraversed environments. VirT&R leverages aerial imagery captured for a target environment to train a Neural Radiance Field (NeRF) model so that dense point clouds and photo-textured meshes can be extracted. The NeRF mesh is used to create a high-fidelity simulation of the environment for piloting an unmanned ground vehicle (UGV) to virtually define a desired path. The mission can then be executed in the actual target environment by using NeRF-derived point cloud submaps associated along the path and an existing LiDAR Teach and Repeat (LT&R) framework. We benchmark the repeatability of VirT&R on over 12 km of autonomous driving data using physical markings that allow a sim-to-real lateral path-tracking error to be obtained and compared with LT&R. VirT&R achieved measured root mean squared errors (RMSE) of 19.5 cm and 18.4 cm in two different environments, which are slightly less than one tire width (24 cm) on the robot used for testing, and respective maximum errors were 39.4 cm and 47.6 cm. This was done using only the NeRF-derived teach map, demonstrating that VirT&R has similar closed-loop path-tracking performance to LT&R but does not require a human to manually teach the path to the UGV in the actual environment.
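The repeatability numbers above are summary statistics over per-marker lateral offsets. As a minimal sketch (hypothetical measurements and function name, not the authors' code), the RMSE and maximum error reported in the abstract could be computed from such offsets like this:

```python
import math

def path_tracking_stats(lateral_errors_cm):
    """Summarize lateral path-tracking error from per-marker
    measurements (centimetres): returns (RMSE, max absolute error)."""
    n = len(lateral_errors_cm)
    rmse = math.sqrt(sum(e * e for e in lateral_errors_cm) / n)
    max_err = max(abs(e) for e in lateral_errors_cm)
    return rmse, max_err

# Hypothetical offsets (cm) measured at physical markings along a route.
errors = [12.0, -18.5, 21.0, -9.3, 25.1, -30.2, 14.7]
rmse, max_err = path_tracking_stats(errors)
print(f"RMSE: {rmse:.1f} cm, max: {max_err:.1f} cm")
# → RMSE: 19.9 cm, max: 30.2 cm
```

An RMSE under the robot's 24 cm tire width, as reported for both test environments, is the paper's criterion for repeat performance comparable to manually taught LT&R.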
Problem

Research questions and friction points this paper is trying to address.

Enables GPS-denied autonomous ground vehicle navigation
Uses aerial imagery to train NeRF for environment simulation
Achieves sim-to-real path-tracking without manual teaching
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses aerial imagery for NeRF model training
Creates high-fidelity simulation for UGV path planning
Leverages NeRF-derived point clouds for autonomous navigation
Desiree Fisker
University of Toronto Institute of Aerospace Studies, 4925 Dufferin St, North York, ON M3H 5T6, Canada
Alexander Krawciw
PhD Student, University of Toronto
field robotics, machine learning
Sven Lilge
Robotics Institute, University of Toronto
Melissa Greeff
Queen's University
Safe Learning-Based Control, Aerial Robotics, Vision-Based Navigation
Timothy D. Barfoot
University of Toronto Institute of Aerospace Studies, 4925 Dufferin St, North York, ON M3H 5T6, Canada