🤖 AI Summary
Current autonomous driving systems rely solely on onboard sensors, leaving them vulnerable to a limited field of view, occlusions, and adverse weather, and without long-term memory of road topology. To address this, we propose a spatial retrieval-augmented paradigm that, for the first time, integrates offline-acquired georeferenced imagery (e.g., Google Maps snapshots) as a plug-and-play modality in a multi-task autonomous driving framework, requiring no additional hardware and enabling "memory-augmented" environmental perception. Our method aligns ego-vehicle trajectories with geographic images to fuse spatial priors, with unified support for object detection, HD mapping, occupancy prediction, end-to-end planning, and generative world modeling. Evaluated on nuScenes, the added modality improves performance on several of these tasks. We open-source our dataset, code, and benchmark to foster research on retrieval-augmented, spatially aware autonomous driving.
📝 Abstract
Existing autonomous driving systems rely on onboard sensors (cameras, LiDAR, IMU, etc.) for environmental perception. However, this paradigm is limited to the drive-time perception horizon and often fails under a restricted field of view, occlusion, or extreme conditions such as darkness and rain. In contrast, human drivers can recall road structure even under poor visibility. To endow models with this "recall" ability, we propose the spatial retrieval paradigm, which introduces offline-retrieved geographic images as an additional input. These images are easy to obtain from offline caches (e.g., Google Maps or stored autonomous driving datasets) without requiring additional sensors, making the new modality a plug-and-play extension for existing AD tasks.
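To make the retrieval step concrete, below is a minimal sketch, assuming the Google Static Maps API as the offline image source. The endpoint and its `center`/`zoom`/`size`/`maptype`/`key` parameters are part of that public API, but the zoom level, tile size, and `GOOGLE_API_KEY` placeholder are illustrative choices, not the paper's exact configuration.

```python
# Hedged sketch: fetch a georeferenced satellite tile centered on the
# ego position. Endpoint and parameters follow the public Google
# Static Maps API; zoom/size/key values are illustrative placeholders.
import requests

STATIC_MAPS_URL = "https://maps.googleapis.com/maps/api/staticmap"

def retrieve_geo_image(lat: float, lon: float, zoom: int = 19,
                       api_key: str = "GOOGLE_API_KEY") -> bytes:
    """Download one satellite tile centered on (lat, lon) in WGS84."""
    params = {
        "center": f"{lat},{lon}",
        "zoom": zoom,               # higher zoom = finer ground resolution
        "size": "640x640",          # standard maximum tile size
        "maptype": "satellite",
        "key": api_key,
    }
    resp = requests.get(STATIC_MAPS_URL, params=params, timeout=10)
    resp.raise_for_status()
    return resp.content             # PNG bytes; cache offline, replay at drive time
```

Because the tiles are fetched once and cached, the retrieval adds no drive-time sensing requirements: at inference the model simply looks up the cached image nearest to the current ego position.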
For experiments, we first extend the nuScenes dataset with geographic images retrieved via the Google Maps APIs and align the new data with ego-vehicle trajectories. We then establish baselines across five core autonomous driving tasks: object detection, online mapping, occupancy prediction, end-to-end planning, and generative world modeling. Extensive experiments show that the added modality improves performance on some of these tasks. We will open-source the dataset curation code, data, and benchmarks to support further study of this new autonomous driving paradigm.
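For the trajectory alignment, a minimal sketch is shown below: nuScenes ego poses are given in meters in a per-location map frame, so converting them to WGS84 coordinates lets them key the retrieval above. The per-location reference origins mirror those published in the nuScenes devkit (`scripts/export_poses.py`); the values below are quoted from memory and should be verified against your devkit version, and the equirectangular approximation is an assumption (accurate to well under a meter at city scale), not necessarily the paper's alignment procedure.

```python
# Hedged sketch: align nuScenes map-frame ego poses with geographic
# coordinates. Origins mirror REFERENCE_COORDINATES in the nuScenes
# devkit (scripts/export_poses.py); verify before relying on them.
import math

MAP_ORIGINS = {  # location -> (origin latitude, origin longitude), WGS84
    "boston-seaport":           (42.336849, -71.057854),
    "singapore-onenorth":       (1.288210, 103.784752),
    "singapore-hollandvillage": (1.299365, 103.782177),
    "singapore-queenstown":     (1.278256, 103.767414),
}

METERS_PER_DEG_LAT = 111_320.0  # mean meters per degree of latitude

def ego_pose_to_wgs84(x: float, y: float, location: str) -> tuple[float, float]:
    """Map-frame meters (assumed x east, y north) -> (lat, lon)."""
    lat0, lon0 = MAP_ORIGINS[location]
    lat = lat0 + y / METERS_PER_DEG_LAT
    lon = lon0 + x / (METERS_PER_DEG_LAT * math.cos(math.radians(lat0)))
    return lat, lon
```

Chaining `ego_pose_to_wgs84` into `retrieve_geo_image` yields, for each keyframe, a geographic image keyed to the ego trajectory, which is the alignment the extended dataset and benchmarks build on.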