🤖 AI Summary
Conventional generative models for end-to-end autonomous driving are computationally expensive and struggle to combine multimodal planning with generalization to long-tail scenarios. To address this, the paper proposes an anchor-guided hybrid diffusion framework for trajectory generation. The method introduces hybrid trajectory anchors that jointly encode static driving priors and dynamic contextual awareness to constrain the diffusion process, enabling efficient, fine-grained multimodal trajectory generation. It further uses a Transformer-based fusion architecture that integrates dense and sparse feature extraction with an anchor offset prediction mechanism. Evaluated on the NAVSIM benchmark, the approach achieves state-of-the-art performance, reduces inference latency by 42% over baseline methods, and significantly improves generalization to long-tail driving scenarios, which suggests strong practical deployability.
📝 Abstract
End-to-end multi-modal planning has become a transformative paradigm in autonomous driving, effectively addressing behavioral multi-modality and the generalization challenge in long-tail scenarios. We propose AnchDrive, a framework for end-to-end driving that effectively bootstraps a diffusion policy to mitigate the high computational cost of traditional generative models. Rather than denoising from pure noise, AnchDrive initializes its planner with a rich set of hybrid trajectory anchors. These anchors are derived from two complementary sources: a static vocabulary of general driving priors and a set of dynamic, context-aware trajectories. The dynamic trajectories are decoded in real time by a Transformer that processes dense and sparse perceptual features. The diffusion model then learns to refine these anchors by predicting a distribution of trajectory offsets, enabling fine-grained refinement. This anchor-based bootstrapping design allows for efficient generation of diverse, high-quality trajectories. Experiments on the NAVSIM benchmark confirm that AnchDrive sets a new state of the art and shows strong generalizability.
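
The anchor-bootstrapped refinement described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`build_hybrid_anchors`, `refine_anchors`), the array shapes, the two-step schedule, and the linearly decaying noise scale are all assumptions made for illustration. The learned diffusion denoiser is replaced by an arbitrary callable that returns per-anchor offsets.

```python
import numpy as np


def build_hybrid_anchors(static_vocab, dynamic_trajs):
    """Stack static prior anchors and context-dependent anchors.

    Both inputs are assumed to have shape (num_anchors, horizon, 2),
    i.e. a sequence of (x, y) waypoints per anchor.
    """
    return np.concatenate([static_vocab, dynamic_trajs], axis=0)


def refine_anchors(anchors, predict_offsets, num_steps=2,
                   noise_scale=0.1, rng=None):
    """Refine anchors with a short denoising loop (illustrative only).

    Instead of starting from pure noise, each step perturbs the current
    estimate slightly and asks `predict_offsets` (a hypothetical stand-in
    for the learned denoiser) for offsets relative to the anchors.
    """
    rng = np.random.default_rng() if rng is None else rng
    trajectories = anchors.copy()
    for step in range(num_steps):
        # Assumed schedule: noise shrinks linearly over the steps.
        scale = noise_scale * (1.0 - step / num_steps)
        noisy = trajectories + scale * rng.standard_normal(anchors.shape)
        offsets = predict_offsets(noisy, anchors, step)
        # The refined trajectory is the anchor plus its predicted offset.
        trajectories = anchors + offsets
    return trajectories


if __name__ == "__main__":
    static_vocab = np.zeros((4, 8, 2))   # placeholder driving priors
    dynamic_trajs = np.ones((3, 8, 2))   # placeholder decoded trajectories
    anchors = build_hybrid_anchors(static_vocab, dynamic_trajs)

    # Toy predictor: pull each noisy sample halfway back to its anchor.
    toy_predictor = lambda noisy, anc, step: 0.5 * (noisy - anc)
    refined = refine_anchors(anchors, toy_predictor,
                             rng=np.random.default_rng(0))
    print(refined.shape)
```

Because the loop starts near plausible trajectories rather than from pure noise, only a few refinement steps are needed in this sketch, which mirrors the efficiency argument made in the abstract.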