🤖 AI Summary
To address the poor generalization of monolithic foundation models to diverse local road conditions in large-scale autonomous driving, this paper proposes the Dynamically Local-Enhancement (DLE) paradigm. Without permanently modifying the base driving planner, DLE adapts online to real-time regional observations by combining a position-varying Markov Decision Process (MDP) formulation with a lightweight graph neural network that extracts region-specific driving features from local observation data. These features describe the local behavior of surrounding objects and are used to enhance a basic reinforcement learning-based policy. In multi-scenario evaluations, DLE reduces collision rates by 37% and increases average reward by 22% over a one-for-all baseline model, while keeping the on-device planner lightweight, offering a scalable, parameter-efficient route to geographically heterogeneous deployment.
📝 Abstract
Current autonomous vehicles operate primarily within limited regions, but demand for broader deployment is increasing. However, as models scale, their limited capacity becomes a significant obstacle to adapting to novel scenarios, and it is increasingly difficult to improve a single monolithic model for every new situation. To address this issue, we introduce the concept of dynamically enhancing a basic driving planner with local driving data, without permanently modifying the planner itself. This approach, termed the Dynamically Local-Enhancement (DLE) Planner, aims to improve the scalability of autonomous driving systems without significantly expanding the planner's size. Our approach introduces a position-varying Markov Decision Process formulation coupled with a graph neural network that extracts region-specific driving features from local observation data. The learned features describe the local behavior of the surrounding objects and are then leveraged to enhance a basic reinforcement learning-based policy. We evaluated our approach in multiple scenarios and compared it with a one-for-all driving model. The results show that our method outperforms the baseline policy in both safety (collision rate) and average reward, while maintaining a lighter scale. This approach has the potential to benefit large-scale autonomous vehicles without greatly expanding on-device driving models.
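The pipeline the abstract describes, a lightweight graph network that summarizes local observations into a region descriptor, which then enhances a frozen base policy, could be sketched roughly as below. All names, shapes, and the additive-logit combination are illustrative assumptions for a minimal message-passing sketch, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def extract_region_features(node_feats, adj, W):
    """One mean-aggregation message-passing step (tiny GNN sketch).

    node_feats: (N, F) observations of N surrounding objects
    adj:        (N, N) 0/1 adjacency among nearby objects
    W:          (F, D) projection weights (hypothetical, would be learned)
    """
    deg = np.maximum(adj.sum(axis=1, keepdims=True), 1.0)
    agg = (adj @ node_feats) / deg          # average each node's neighbors
    return np.tanh(agg @ W).mean(axis=0)    # pool into one region descriptor

def enhanced_action_probs(base_logits, region_feat, V):
    """Bias the base planner's action logits with the region descriptor;
    the base planner's own parameters are never modified."""
    return softmax(base_logits + region_feat @ V)

# Toy example: 4 nearby objects, 3 features each, 5 discrete actions
feats = rng.normal(size=(4, 3))
adj = (rng.random((4, 4)) > 0.5).astype(float)
W = rng.normal(size=(3, 8))
V = rng.normal(size=(8, 5))
base_logits = rng.normal(size=5)

region = extract_region_features(feats, adj, W)
probs = enhanced_action_probs(base_logits, region, V)
```

The additive-logit enhancement here is just one plausible way to inject region features without touching the base planner; the paper may combine them differently.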