Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models

📅 2024-12-05
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
In traffic simulation, multi-agent policies trained with open-loop behavior cloning suffer from covariate shift when they are unrolled in closed loop, which makes it harder to recover the joint distribution of real-world trajectories. To address this, the paper proposes a closed-loop supervised fine-tuning strategy called Closest Among Top-K (CAT-K) rollouts. Built on a tokenized multi-agent policy, CAT-K unrolls the policy in closed loop and, at each step, picks among the K most likely tokens the one that keeps the agent closest to the recorded trajectory. The resulting rollouts are used for supervised fine-tuning on existing trajectory data, without reinforcement learning or generative adversarial imitation. Empirically, CAT-K fine-tuning lets a 7M-parameter policy outperform a 102M-parameter model from the same model family and reach the top spot on the Waymo Sim Agents Challenge leaderboard at the time of submission. The implementation is publicly available.

๐Ÿ“ Abstract
Traffic simulation aims to learn a policy for traffic agents that, when unrolled in closed-loop, faithfully recovers the joint distribution of trajectories observed in the real world. Inspired by large language models, tokenized multi-agent policies have recently become the state-of-the-art in traffic simulation. However, they are typically trained through open-loop behavior cloning, and thus suffer from covariate shift when executed in closed-loop during simulation. In this work, we present Closest Among Top-K (CAT-K) rollouts, a simple yet effective closed-loop fine-tuning strategy to mitigate covariate shift. CAT-K fine-tuning only requires existing trajectory data, without reinforcement learning or generative adversarial imitation. Concretely, CAT-K fine-tuning enables a small 7M-parameter tokenized traffic simulation policy to outperform a 102M-parameter model from the same model family, achieving the top spot on the Waymo Sim Agent Challenge leaderboard at the time of submission. The code is available at https://github.com/NVlabs/catk.
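
The abstract names the Closest Among Top-K rollout without spelling out a single step. As a rough illustration only, one selection step might look like the sketch below. It assumes a hypothetical vocabulary in which each token is a 2D displacement, and it measures closeness with Euclidean distance to the recorded next position. The function name, the toy vocabulary, and the distance choice are assumptions made for this example and are not taken from the released code.

```python
import math


def closest_among_top_k(probs, token_displacements, current_pos, gt_next_pos, k):
    """Return the index of the token, among the k most likely ones,
    whose resulting position is closest to the ground-truth next position.

    All arguments are hypothetical stand-ins for illustration:
    probs               -- policy probabilities, one per token
    token_displacements -- (dx, dy) motion encoded by each token
    current_pos         -- (x, y) of the agent in the rollout
    gt_next_pos         -- (x, y) recorded in the real trajectory
    """
    # Indices of the k most probable tokens under the policy.
    top_k = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]

    def distance_to_gt(i):
        dx, dy = token_displacements[i]
        return math.hypot(current_pos[0] + dx - gt_next_pos[0],
                          current_pos[1] + dy - gt_next_pos[1])

    # Among those candidates, keep the one that lands nearest the ground truth.
    return min(top_k, key=distance_to_gt)


# Toy vocabulary of four displacement tokens.
displacements = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 0.0)]
probs = [0.1, 0.4, 0.3, 0.2]

choice_k2 = closest_among_top_k(probs, displacements, (0.0, 0.0), (1.9, 0.1), k=2)
choice_k4 = closest_among_top_k(probs, displacements, (0.0, 0.0), (1.9, 0.1), k=4)
print(choice_k2)  # 1: token 3 is nearest to the ground truth but is not in the top 2
print(choice_k4)  # 3: with all tokens allowed, the nearest one is chosen
```

With k = 1 the step reduces to picking the most likely token, and with k equal to the vocabulary size it always picks the token nearest the ground truth, so k trades off staying on the policy's own distribution against staying near the recorded trajectory. How the rollout is turned into a training loss is not covered by this sketch.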
Problem

Research questions and friction points this paper is trying to address.

Mitigate covariate shift in tokenized traffic models
Improve closed-loop performance without reinforcement learning
Enhance traffic simulation accuracy using CAT-K fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Closed-loop fine-tuning for traffic models
CAT-K rollouts mitigate covariate shift
A small CAT-K fine-tuned tokenized policy (7M parameters) outperforms a larger 102M-parameter model from the same family