AI Summary
This work addresses scalable modeling of future driving states for autonomous driving. We propose an autoregressive behavior model built on a large-scale Transformer: driving is framed as a sequential decision-making task, and the future states of multiple agents are represented as tokens and predicted autoregressively. To our knowledge, this is the first systematic study of how performance scales jointly with data volume, model parameter count, and compute. The model enables end-to-end joint optimization of planning and prediction. It is pre-trained on large-scale multimodal driving datasets and evaluated in closed-loop simulation. Experiments show robust closed-loop driving in complex real-world scenarios, higher prediction accuracy than state-of-the-art methods, and consistent gains as data and model scale increase, empirically supporting the benefits of data-driven scaling for behavior modeling.
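As an illustration of what a scaling study like this typically measures, the sketch below fits a power law of the form loss = a * size^(-b) by least squares in log-log space. The power-law form, the function name `fit_power_law`, and every number are assumptions made for illustration; the data points are synthetic and are not results from the paper.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(sizes: Sequence[float], losses: Sequence[float]) -> Tuple[float, float]:
    """Fit loss = a * size**(-b) by ordinary least squares on log-log values.

    Returns (a, b). Hypothetical helper, not taken from the paper.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope of the log-log line equals -b
    a = math.exp(mean_y - slope * mean_x)  # intercept of the log-log line equals log(a)
    return a, -slope


# Synthetic points generated from loss = 2.0 * size**-0.1 (NOT measurements from the paper).
sizes = [1e6, 1e7, 1e8, 1e9]
losses = [2.0 * s ** -0.1 for s in sizes]
a, b = fit_power_law(sizes, losses)
```

Because the synthetic points lie exactly on a power law, the fit recovers the generating coefficients up to floating-point error. Real loss curves are noisier and may call for a different functional form.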
Abstract
We present DriveGPT, a scalable behavior model for autonomous driving. We model driving as a sequential decision-making task, and learn a transformer model to predict future agent states as tokens in an autoregressive fashion. We scale up our model parameters and training data by multiple orders of magnitude, enabling us to explore the scaling properties in terms of dataset size, model parameters, and compute. We evaluate DriveGPT across different scales in a planning task, through both quantitative metrics and qualitative examples, including closed-loop driving in complex real-world scenarios. In a separate prediction task, DriveGPT outperforms state-of-the-art baselines and exhibits improved performance by pretraining on a large-scale dataset, further validating the benefits of data scaling.
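To make "predict future agent states as tokens in an autoregressive fashion" concrete, here is a minimal sketch of the general idea. It is not DriveGPT's actual tokenizer or architecture: the displacement grid (`BINS`, `STEP`), the helpers `encode`, `decode`, and `rollout`, and the toy `repeat_last` stand-in for a trained transformer are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

BINS = 5    # bins per axis (assumption for illustration)
STEP = 0.5  # metres of displacement per bin (assumption for illustration)


def encode(dx: float, dy: float) -> int:
    """Map a per-step (dx, dy) displacement to a token id, clamping to the grid."""
    half = BINS // 2
    ix = max(-half, min(half, round(dx / STEP))) + half
    iy = max(-half, min(half, round(dy / STEP))) + half
    return ix * BINS + iy


def decode(token: int) -> Tuple[float, float]:
    """Map a token id back to the centre of its displacement bin."""
    half = BINS // 2
    ix, iy = divmod(token, BINS)
    return ((ix - half) * STEP, (iy - half) * STEP)


def rollout(
    logits_fn: Callable[[Sequence[int]], Sequence[float]],
    history: Sequence[int],
    steps: int,
) -> List[int]:
    """Greedy autoregressive decoding: each chosen token is fed back as input."""
    tokens = list(history)
    for _ in range(steps):
        logits = logits_fn(tokens)
        tokens.append(max(range(len(logits)), key=logits.__getitem__))
    return tokens[len(history):]


def repeat_last(tokens: Sequence[int]) -> List[float]:
    """Toy stand-in for a trained model: puts all its score on the last token."""
    return [1.0 if t == tokens[-1] else 0.0 for t in range(BINS * BINS)]


# Roll out three future steps from one observed step of 1 m forward motion.
future = rollout(repeat_last, [encode(1.0, 0.0)], steps=3)
path = [decode(t) for t in future]
```

In a real system, `logits_fn` would be a transformer conditioned on scene context such as the map and other agents, and sampling would usually replace the greedy `max`. The loop structure, in which each predicted token is appended to the input for the next step, is the part the abstract describes.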