ReconDreamer-RL: Enhancing Reinforcement Learning via Diffusion-based Scene Reconstruction

📅 2025-08-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To narrow the sim-to-real gap in reinforcement learning (RL) for end-to-end autonomous driving, this paper proposes ReconDreamer-RL, a framework that integrates video diffusion priors into scene reconstruction. Its simulator, ReconSimulator, uses a video diffusion prior for appearance modeling and a kinematic model for physical modeling, reconstructing driving scenarios from real-world data. A Dynamic Adversary Agent (DAA) adjusts the trajectories of surrounding vehicles relative to the ego vehicle to generate corner-case traffic scenarios such as cut-ins, and a Cousin Trajectory Generator (CTG) counters training data that is biased toward simple straight-line movements. In the reported experiments, ReconDreamer-RL outperforms imitation learning methods with a 5x reduction in the Collision Ratio.

📝 Abstract

Reinforcement learning for training end-to-end autonomous driving models in closed-loop simulations is gaining growing attention. However, most simulation environments differ significantly from real-world conditions, creating a substantial simulation-to-reality (sim2real) gap. To bridge this gap, some approaches utilize scene reconstruction techniques to create photorealistic environments as a simulator. While this improves realistic sensor simulation, these methods are inherently constrained by the distribution of the training data, making it difficult to render high-quality sensor data for novel trajectories or corner case scenarios. Therefore, we propose ReconDreamer-RL, a framework designed to integrate video diffusion priors into scene reconstruction to aid reinforcement learning, thereby enhancing end-to-end autonomous driving training. Specifically, in ReconDreamer-RL, we introduce ReconSimulator, which combines the video diffusion prior for appearance modeling and incorporates a kinematic model for physical modeling, thereby reconstructing driving scenarios from real-world data. This narrows the sim2real gap for closed-loop evaluation and reinforcement learning. To cover more corner-case scenarios, we introduce the Dynamic Adversary Agent (DAA), which adjusts the trajectories of surrounding vehicles relative to the ego vehicle, autonomously generating corner-case traffic scenarios (e.g., cut-in). Finally, the Cousin Trajectory Generator (CTG) is proposed to address the issue of training data distribution, which is often biased toward simple straight-line movements. Experiments show that ReconDreamer-RL improves end-to-end autonomous driving training, outperforming imitation learning methods with a 5x reduction in the Collision Ratio.
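The abstract does not say which kinematic model ReconSimulator uses. As a rough illustration of what "a kinematic model for physical modeling" can mean, the sketch below steps a vehicle with the standard kinematic bicycle model. The names `VehicleState`, `bicycle_step`, and `rollout`, the 2.7 m wheelbase, and the 0.1 s time step are assumptions made for this example and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class VehicleState:
    x: float      # rear-axle position, metres
    y: float
    yaw: float    # heading, radians
    speed: float  # longitudinal speed, m/s


def bicycle_step(state, accel, steer, wheelbase=2.7, dt=0.1):
    """Advance one time step with a rear-axle kinematic bicycle model.

    Uses forward Euler integration; wheelbase and dt are illustrative defaults.
    """
    x = state.x + state.speed * math.cos(state.yaw) * dt
    y = state.y + state.speed * math.sin(state.yaw) * dt
    yaw = state.yaw + state.speed / wheelbase * math.tan(steer) * dt
    speed = max(0.0, state.speed + accel * dt)  # no reversing in this sketch
    return VehicleState(x, y, yaw, speed)


def rollout(state, controls, **kwargs):
    """Apply a sequence of (accel, steer) controls and return all visited states."""
    states = [state]
    for accel, steer in controls:
        states.append(bicycle_step(states[-1], accel, steer, **kwargs))
    return states
```

A policy trained in closed loop would emit the `(accel, steer)` pairs, and the resulting poses would tell the renderer where to place the ego camera. Constraining motion this way keeps novel trajectories physically plausible, which is the role the abstract assigns to the kinematic component.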

Problem

Research questions and friction points this paper is trying to address.

Bridges the sim2real gap in closed-loop autonomous driving training

Generates corner-case scenarios for robust RL training

Addresses training data that is biased toward simple straight-line movements

Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates video diffusion priors into scene reconstruction

Uses a Dynamic Adversary Agent (DAA) to generate corner-case scenarios

Introduces a Cousin Trajectory Generator (CTG) for trajectory diversity
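To make the cut-in example concrete, here is a minimal sketch of how a surrounding vehicle's trajectory could be adjusted relative to the ego vehicle. It is not the paper's DAA algorithm, which the abstract does not specify. It works in a lane-aligned frame (longitudinal `s`, lateral offset `d` from the ego lane centre), and the function names, the 8 m gap, the 3.5 m lane width, and the timing values are all hypothetical.

```python
def smoothstep(t):
    """Cubic ease from 0 to 1, clamped outside the interval [0, 1]."""
    t = min(1.0, max(0.0, t))
    return t * t * (3.0 - 2.0 * t)


def cut_in_trajectory(ego_s, ego_speed, gap=8.0, lane_width=3.5,
                      start=1.0, duration=3.0, horizon=6.0, dt=0.5):
    """Return (t, s, d) waypoints for a vehicle cutting into the ego lane.

    The vehicle starts one lane to the side (d = lane_width), stays `gap`
    metres ahead of a constant-speed ego, and merges to d = 0 between
    `start` and `start + duration` seconds. All defaults are illustrative.
    """
    steps = int(round(horizon / dt))
    waypoints = []
    for i in range(steps + 1):
        t = i * dt
        s = ego_s + gap + ego_speed * t
        d = lane_width * (1.0 - smoothstep((t - start) / duration))
        waypoints.append((t, s, d))
    return waypoints
```

Shrinking `gap` or `duration` gives a more aggressive cut-in. The same idea of applying smooth lateral offsets to a recorded path is one plausible way to obtain lane-change-like variants of straight-line logs, though the abstract does not describe how CTG actually constructs its trajectories.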
