SimScale: Learning to Drive via Real-World Simulation at Scale

📅 2025-11-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Real-world driving data contains few safety-critical and out-of-distribution scenarios, which limits how well autonomous driving planners generalize. This paper proposes SimScale, a simulation framework that synthesizes large numbers of unseen states grounded in real-world logs. It combines neural rendering for high-fidelity multi-view observations, reactive environment modeling, pseudo-expert trajectory synthesis, and co-training on real and simulated data. The key finding is that policy performance improves smoothly as the amount of simulation data grows, which the paper validates empirically across data scales. Planning performance improves by up to +6.8 EPDMS on NavHard and +2.9 on NavTest without additional real-world data, indicating better robustness in long-tail and complex scenarios.

📝 Abstract

Achieving fully autonomous driving requires learning rational decisions across a wide span of scenarios, including safety-critical and out-of-distribution ones. However, such cases are underrepresented in real-world corpora collected by human experts. To compensate for this lack of data diversity, we introduce a novel and scalable simulation framework capable of synthesizing massive numbers of unseen states from existing driving logs. Our pipeline uses advanced neural rendering with a reactive environment to generate high-fidelity multi-view observations controlled by a perturbed ego trajectory. Furthermore, we develop a pseudo-expert trajectory generation mechanism that provides action supervision for these newly simulated states. With the synthesized data, we find that a simple co-training strategy on both real-world and simulated samples leads to significant improvements in robustness and generalization for various planning methods on challenging real-world benchmarks, up to +6.8 EPDMS on navhard and +2.9 on navtest. More importantly, this policy improvement scales smoothly when only the simulation data is increased, without any extra real-world data. We further report several crucial findings about this sim-real learning system, which we term SimScale, including the design of pseudo-experts and the scaling properties of different policy architectures. Our simulation data and code will be released.
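The co-training idea in the abstract can be illustrated with a minimal sketch. The abstract only says that real and simulated samples are trained on together; the fixed per-batch mixing ratio, the function name `mixed_batches`, and the `sim_ratio` parameter below are assumptions made for illustration, not the authors' implementation.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def mixed_batches(
    real: Sequence[str],
    sim: Sequence[str],
    batch_size: int,
    sim_ratio: float,
    seed: int = 0,
) -> Iterator[List[Tuple[str, str]]]:
    """Yield batches that mix real and simulated samples at a fixed ratio.

    Hypothetical helper: `sim_ratio` is the assumed fraction of each batch
    drawn from the simulated pool; the rest comes from real logs.
    """
    rng = random.Random(seed)
    n_sim = round(batch_size * sim_ratio)
    n_real = batch_size - n_sim
    while True:
        # Sample with replacement so either pool can be much larger or
        # smaller than the other without changing the batch composition.
        batch = [("real", x) for x in rng.choices(real, k=n_real)]
        batch += [("sim", x) for x in rng.choices(sim, k=n_sim)]
        rng.shuffle(batch)
        yield batch


# Example: the simulated pool can grow while the real pool stays fixed.
real_logs = ["real_0", "real_1", "real_2"]
sim_logs = [f"sim_{i}" for i in range(100)]
first_batch = next(mixed_batches(real_logs, sim_logs, batch_size=8, sim_ratio=0.25))
```

Under this assumed scheme, adding more simulated samples only enlarges the pool that `sim` draws from, which matches the setting described above where simulation data is scaled without new real-world data.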

Problem

Research questions and friction points this paper is trying to address.

Addresses lack of diverse safety-critical scenarios in autonomous driving datasets

Develops scalable simulation framework to synthesize unseen driving states

Improves planning robustness through co-training on real and simulated data

Innovation

Methods, ideas, or system contributions that make the work stand out.

Neural rendering generates high-fidelity multi-view observations

Pseudo-expert trajectory mechanism provides action supervision

Co-training strategy combines real and simulated data
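To make the notion of a perturbed ego trajectory concrete, here is a small sketch of one possible perturbation: a lateral offset that ramps up along a logged path. This page does not describe how SimScale samples its perturbations, so the ramped-offset scheme, the function name `perturb_laterally`, and the 2D waypoint representation are all assumptions for illustration only.

```python
import math
from typing import List, Tuple

Waypoint = Tuple[float, float]  # assumed (x, y) position in metres


def perturb_laterally(traj: List[Waypoint], max_offset: float) -> List[Waypoint]:
    """Shift waypoints sideways, ramping the offset from 0 to `max_offset`.

    Hypothetical perturbation: the first waypoint is unchanged so the
    perturbed path still starts from the logged ego state.
    """
    n = len(traj)
    out: List[Waypoint] = []
    for i, (x, y) in enumerate(traj):
        # Estimate the local heading from the neighbouring waypoints.
        x0, y0 = traj[max(i - 1, 0)]
        x1, y1 = traj[min(i + 1, n - 1)]
        heading = math.atan2(y1 - y0, x1 - x0)
        # Linear ramp: 0 at the first waypoint, 1 at the last.
        ramp = i / (n - 1) if n > 1 else 0.0
        offset = max_offset * ramp
        # Move along the left-hand normal of the heading direction.
        out.append((x - offset * math.sin(heading), y + offset * math.cos(heading)))
    return out


# Example: a straight path along the x-axis drifts 1 m to the left.
logged = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
perturbed = perturb_laterally(logged, max_offset=1.0)
```

In a pipeline like the one summarized above, a path of this kind would drive the rendering of new observations, and a pseudo-expert would then supply the target trajectory for the resulting state.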
