Robust Autonomy Emerges from Self-Play

📅 2025-02-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the dependence of autonomous driving approaches on human driving data by proposing a pure self-play training paradigm. It shows that robust, naturalistic driving policies can emerge entirely from self-play in simulation, without any human data during training. The method introduces Gigaflow, a batched simulator that synthesizes and trains on 42 years of subjective driving experience per hour on a single 8-GPU node, and uses it for self-play at a scale of 1.6 billion km of driving. Contributions include: (1) evidence that pure self-play yields a robust, realistic driving policy; (2) state-of-the-art performance on three independent autonomous driving benchmarks; (3) better performance than the prior state of the art on recorded real-world scenarios, amidst human drivers; and (4) an average of 17.5 years of continuous driving between incidents in simulation.

📝 Abstract
Self-play has powered breakthroughs in two-player and multi-player games. Here we show that self-play is a surprisingly effective strategy in another domain. We show that robust and naturalistic driving emerges entirely from self-play in simulation at unprecedented scale: 1.6 billion km of driving. This is enabled by Gigaflow, a batched simulator that can synthesize and train on 42 years of subjective driving experience per hour on a single 8-GPU node. The resulting policy achieves state-of-the-art performance on three independent autonomous driving benchmarks. The policy outperforms the prior state of the art when tested on recorded real-world scenarios, amidst human drivers, without ever seeing human data during training. The policy is realistic when assessed against human references and achieves unprecedented robustness, averaging 17.5 years of continuous driving between incidents in simulation.
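
To put the quoted throughput in perspective, the following back-of-envelope sketch converts "42 years of subjective driving experience per hour" into an aggregate faster-than-real-time factor. It uses only the figures stated in the abstract (42 years per hour, 8 GPUs); the function names and the 365.25-day year are illustrative assumptions, and the result is an aggregate over all simulated agents, not the speed of any single simulated world.

```python
# Back-of-envelope arithmetic on the throughput figure quoted in the abstract.
# Assumption (not from the paper): one year = 365.25 days of 24 hours.
HOURS_PER_YEAR = 365.25 * 24  # 8766.0


def realtime_speedup(sim_years_per_hour: float) -> float:
    """Hours of simulated driving experience produced per wall-clock hour."""
    return sim_years_per_hour * HOURS_PER_YEAR


def per_gpu_speedup(sim_years_per_hour: float, num_gpus: int) -> float:
    """Same aggregate factor, divided evenly across the GPUs in the node."""
    return realtime_speedup(sim_years_per_hour) / num_gpus


if __name__ == "__main__":
    print(realtime_speedup(42))    # 368172.0
    print(per_gpu_speedup(42, 8))  # 46021.5
```

Under these assumptions, one wall-clock hour on the node corresponds to roughly 368,000 hours of simulated driving experience, or about 46,000 per GPU if the load is split evenly.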
Problem

Research questions and friction points this paper is trying to address.

Self-play in autonomous driving simulation
Robust driving policy without human data
Unprecedented scale of simulated driving experience
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-play in simulation
Gigaflow batched simulator
State-of-the-art driving policy