🤖 AI Summary
Existing neural reconstruction methods, such as NeRF and 3D Gaussian Splatting, often produce artifacts in novel view synthesis and struggle to realistically integrate dynamic objects across scenes, limiting simulation fidelity. This work proposes an online generative enhancement framework that, for the first time, adapts a pretrained multi-step image diffusion model into a single-step temporally conditioned enhancer. By leveraging specially curated synthetic-to-real image pairs, the method corrects appearance inconsistencies, artifacts, and illumination distortions in real time. Running efficiently on a single GPU, the approach significantly improves the photorealism and temporal coherence of rendered sequences, offering a high-fidelity and scalable simulation solution for applications such as autonomous driving.
📝 Abstract
Simulation is essential to the development and evaluation of autonomous robots such as self-driving vehicles. Neural reconstruction is emerging as a promising solution, as it enables simulating a wide variety of scenarios from real-world data alone in an automated and scalable way. However, while methods such as NeRF and 3D Gaussian Splatting can produce visually compelling results, they often exhibit artifacts, particularly when rendering novel views, and fail to realistically integrate inserted dynamic objects, especially when those objects were captured in different scenes. To overcome these limitations, we introduce DiffusionHarmonizer, an online generative enhancement framework that transforms renderings from such imperfect scenes into temporally consistent outputs while improving their realism. At its core is a single-step temporally conditioned enhancer, converted from a pretrained multi-step image diffusion model, that is capable of running in online simulators on a single GPU. The key to training it effectively is a custom data curation pipeline that constructs synthetic-real pairs emphasizing appearance harmonization, artifact correction, and lighting realism. The result is a scalable system that significantly elevates simulation fidelity in both research and production environments.
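To make the core idea concrete, here is a minimal toy sketch of the difference between a multi-step diffusion sampler and a single-step, temporally conditioned enhancer of the kind the abstract describes. Everything here is illustrative and hypothetical: `toy_denoise_step` stands in for a full denoising network pass, and the function names, `blend` weight, and array shapes are invented for this sketch, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_denoise_step(x, t):
    """Toy stand-in for one denoiser pass: nudge the frame toward a
    smoothed version of itself, with step size shrinking as t grows."""
    smoothed = 0.5 * (x + np.roll(x, 1, axis=0))
    return x + (smoothed - x) / (t + 1)

def multi_step_enhance(frame, num_steps=50):
    """Baseline sampler: iterate the denoiser many times per frame.
    This is the part that is too slow for an online simulator."""
    x = frame
    for t in reversed(range(num_steps)):
        x = toy_denoise_step(x, t)
    return x

def single_step_enhance(frame, prev_output, blend=0.2):
    """Distilled enhancer: a single pass per frame, conditioned on the
    previously enhanced frame to encourage temporal consistency
    (the blend weight here is a hypothetical stand-in for learned
    temporal conditioning)."""
    x = toy_denoise_step(frame, 0)  # one pass instead of a 50-step loop
    return (1 - blend) * x + blend * prev_output

# Enhance a short sequence of noisy rendered frames online.
frames = [rng.standard_normal((4, 4)) for _ in range(3)]
prev = frames[0]
enhanced = []
for f in frames:
    prev = single_step_enhance(f, prev)
    enhanced.append(prev)
```

The point of the sketch is the cost structure, not the arithmetic: the multi-step baseline pays `num_steps` network evaluations per frame, while the single-step enhancer pays one, which is what makes running inside an online simulator on a single GPU plausible.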