ReconDreamer-RL: Enhancing Reinforcement Learning via Diffusion-based Scene Reconstruction

📅 2025-08-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the sim-to-real gap in reinforcement learning (RL) for end-to-end autonomous driving, this paper proposes ReconDreamer-RL, a framework built around ReconSimulator, a high-fidelity, interactive simulator that integrates video diffusion priors with kinematic modeling. Methodologically, a video diffusion model reconstructs photorealistic driving-scene appearance while a coupled kinematic model enforces physical consistency. To improve robustness, a Dynamic Adversary Agent generates rare, extreme traffic scenarios (e.g., aggressive cut-ins), and a Cousin Trajectory Generator mitigates the training data's distribution bias by synthesizing semantically similar yet behaviorally diverse trajectories. Experiments show that ReconDreamer-RL substantially improves closed-loop RL training: relative to imitation learning baselines, the collision ratio drops fivefold, with concurrent gains in generalization and safety across unseen environments and traffic conditions.

📝 Abstract
Reinforcement learning for training end-to-end autonomous driving models in closed-loop simulations is gaining growing attention. However, most simulation environments differ significantly from real-world conditions, creating a substantial simulation-to-reality (sim2real) gap. To bridge this gap, some approaches utilize scene reconstruction techniques to create photorealistic environments as a simulator. While this improves realistic sensor simulation, these methods are inherently constrained by the distribution of the training data, making it difficult to render high-quality sensor data for novel trajectories or corner case scenarios. Therefore, we propose ReconDreamer-RL, a framework designed to integrate video diffusion priors into scene reconstruction to aid reinforcement learning, thereby enhancing end-to-end autonomous driving training. Specifically, in ReconDreamer-RL, we introduce ReconSimulator, which combines the video diffusion prior for appearance modeling and incorporates a kinematic model for physical modeling, thereby reconstructing driving scenarios from real-world data. This narrows the sim2real gap for closed-loop evaluation and reinforcement learning. To cover more corner-case scenarios, we introduce the Dynamic Adversary Agent (DAA), which adjusts the trajectories of surrounding vehicles relative to the ego vehicle, autonomously generating corner-case traffic scenarios (e.g., cut-in). Finally, the Cousin Trajectory Generator (CTG) is proposed to address the issue of training data distribution, which is often biased toward simple straight-line movements. Experiments show that ReconDreamer-RL improves end-to-end autonomous driving training, outperforming imitation learning methods with a 5x reduction in the Collision Ratio.
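The abstract describes the Dynamic Adversary Agent (DAA) as adjusting surrounding vehicles' trajectories relative to the ego vehicle to produce corner cases such as cut-ins. A minimal sketch of that idea, with all function names and parameters being illustrative assumptions rather than details from the paper, might look like:

```python
# Hypothetical sketch of the DAA cut-in idea: laterally blend a
# surrounding vehicle's trajectory toward the ego lane so it cuts
# in ahead of the ego vehicle. Names and parameters are illustrative,
# not from the paper.

def make_cut_in(adversary_traj, ego_lane_y, start_step, duration):
    """Blend the adversary's lateral position toward the ego lane
    center over `duration` steps, starting at `start_step`.

    adversary_traj: list of (x, y) waypoints for the adversary.
    """
    new_traj = []
    for t, (x, y) in enumerate(adversary_traj):
        if t < start_step:
            new_traj.append((x, y))  # keep the original lane
        else:
            # linear blend toward the ego lane center
            alpha = min(1.0, (t - start_step) / duration)
            new_traj.append((x, (1 - alpha) * y + alpha * ego_lane_y))
    return new_traj

# Adversary drives straight in an adjacent lane (y = 3.5 m);
# ego lane center is at y = 0.
traj = [(float(t), 3.5) for t in range(10)]
cut_in = make_cut_in(traj, ego_lane_y=0.0, start_step=4, duration=4)
```

The edited trajectory would then be re-rendered by the simulator's diffusion-based reconstruction so the RL policy observes a photorealistic version of the adversarial maneuver.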
Problem

Research questions and friction points this paper is trying to address.

Bridges sim2real gap in autonomous driving training
Generates corner-case scenarios for robust RL training
Addresses biased training data distribution in simulations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates video diffusion priors for scene reconstruction
Uses Dynamic Adversary Agent for corner-case scenarios
Introduces Cousin Trajectory Generator for data diversity
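The Cousin Trajectory Generator is described as countering a training distribution biased toward simple straight-line driving by synthesizing semantically similar but more diverse trajectories. A minimal sketch of one plausible interpretation, lateral offsets plus mild curvature applied to a logged straight trajectory (all names and constants here are illustrative assumptions, not the paper's method):

```python
# Hypothetical sketch of the Cousin Trajectory Generator idea:
# perturb a logged, mostly-straight ego trajectory into "cousin"
# variants that stay semantically similar while adding behavioral
# diversity. Illustrative only; not the paper's actual algorithm.
import math

def cousin_trajectories(base_traj, offsets, curvature=0.05):
    """Generate laterally shifted, gently curved variants of base_traj.

    base_traj: list of (x, y) waypoints.
    offsets:   lateral shifts in meters, one cousin per offset.
    """
    cousins = []
    for d in offsets:
        cousin = [
            (x, y + d + curvature * math.sin(0.1 * x))  # shift + mild wiggle
            for (x, y) in base_traj
        ]
        cousins.append(cousin)
    return cousins

base = [(float(t), 0.0) for t in range(20)]  # straight-line log
variants = cousin_trajectories(base, offsets=[-1.5, 0.0, 1.5])
```

Each cousin would again be rendered through the diffusion-prior reconstruction, giving the RL agent sensor data for maneuvers the original log never contained.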