AI Summary
To address the scarcity of expert demonstrations in autonomous driving imitation learning, this work proposes modeling trajectories of surrounding vehicles as reusable "implicit expert demonstrations," extending beyond conventional reliance solely on ego-vehicle trajectories. Methodologically, we introduce a dynamic sampling strategy grounded in behavioral informativeness and trajectory diversity to efficiently select and augment multi-agent trajectory data; this is integrated within the PLUTO learnable planning framework and evaluated on the nuPlan simulation platform. Results demonstrate that training with only 10% of the original dataset achieves performance comparable to full-data training, while significantly reducing collision rates and consistently outperforming baseline methods across all safety metrics. To our knowledge, this is the first systematic effort to incorporate trajectories of surrounding traffic participants into the imitation learning paradigm, offering a novel pathway toward sample-efficient and safety-critical autonomous decision-making.
Abstract
Imitation learning is a promising approach for training autonomous vehicles (AVs) to navigate complex traffic environments by mimicking expert driver behaviors. However, a major challenge in this paradigm lies in effectively utilizing available driving data, as collecting new data is resource-intensive and often limited in its ability to cover diverse driving scenarios. While existing imitation learning frameworks focus on leveraging expert demonstrations, they often overlook the potential of additional complex driving data from surrounding traffic participants. In this paper, we propose a data augmentation strategy that enhances imitation learning by leveraging the observed trajectories of nearby vehicles, captured through the AV's sensors, as additional expert demonstrations. We introduce a vehicle selection sampling strategy that prioritizes informative and diverse driving behaviors, yielding a richer training dataset. We evaluate our approach using the state-of-the-art learning-based planning method PLUTO on the nuPlan dataset and demonstrate that our augmentation method leads to improved performance in complex driving scenarios. Specifically, our method reduces collision rates and improves safety metrics compared to the baseline. Notably, even when using only 10% of the original dataset, our method achieves performance comparable to that of training on the full dataset, with improved collision rates. Our findings highlight the importance of leveraging diverse real-world trajectory data in imitation learning and provide insights into data augmentation strategies for autonomous driving.
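The abstract does not spell out how the vehicle selection strategy scores candidates, so the following is only an illustrative sketch of one way to prioritize informative and diverse trajectories. Everything in it is an assumption rather than the paper's method: the informativeness proxy (accumulated heading change), the diversity measure (mean pointwise distance to the nearest already-selected trajectory), the greedy selection loop, and the names `informativeness`, `distance`, `select_agents`, and `diversity_weight` are all hypothetical. Trajectories are assumed to be equal-length lists of (x, y) waypoints in a common frame.

```python
import math
from typing import List, Sequence, Tuple

# Assumed representation: (x, y) waypoints sampled at a fixed time step.
Trajectory = Sequence[Tuple[float, float]]


def informativeness(traj: Trajectory) -> float:
    """Hypothetical proxy: total absolute heading change along the path (radians)."""
    headings = [
        math.atan2(y1 - y0, x1 - x0)
        for (x0, y0), (x1, y1) in zip(traj, traj[1:])
        if (x1, y1) != (x0, y0)  # skip stationary steps, where heading is undefined
    ]
    total = 0.0
    for h0, h1 in zip(headings, headings[1:]):
        # Wrap the heading difference into [-pi, pi) before accumulating.
        diff = (h1 - h0 + math.pi) % (2 * math.pi) - math.pi
        total += abs(diff)
    return total


def distance(a: Trajectory, b: Trajectory) -> float:
    """Mean pointwise Euclidean distance between two equal-length trajectories."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def select_agents(
    trajs: List[Trajectory], k: int, diversity_weight: float = 1.0
) -> List[int]:
    """Greedily pick up to k trajectory indices by informativeness plus diversity."""
    scores = [informativeness(t) for t in trajs]
    chosen: List[int] = []
    while len(chosen) < min(k, len(trajs)):
        best, best_val = -1, -math.inf
        for i, traj in enumerate(trajs):
            if i in chosen:
                continue
            # Diversity term: distance to the closest already-chosen trajectory
            # (zero while nothing has been chosen yet).
            div = min((distance(traj, trajs[j]) for j in chosen), default=0.0)
            val = scores[i] + diversity_weight * div
            if val > best_val:
                best, best_val = i, val
        chosen.append(best)
    return chosen


# Example: two near-identical straight paths and one left turn.
straight = [(float(i), 0.0) for i in range(5)]
straight_offset = [(float(i), 0.5) for i in range(5)]
left_turn = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (2.0, 2.0)]
selected = select_agents([straight, straight_offset, left_turn], k=2)
```

In this toy example the left turn is picked first because it is the only trajectory with any heading change. The second pick is then decided purely by the diversity term, since both straight paths score zero on informativeness. A real pipeline would also need to convert each selected agent's trajectory into that agent's own reference frame before using it as a demonstration, which this sketch leaves out.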