WonderFree: Enhancing Novel View Quality and Cross-View Consistency for 3D Scene Exploration

📅 2025-06-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing methods that turn a single image into an interactive 3D scene suffer severe degradation in rendering quality and cross-view consistency under large viewpoint changes, particularly when moving forward into unobserved regions, which hinders free exploration. To address this, WonderFree splits the problem into novel view quality and cross-view consistency and tackles each with a dedicated component: WorldRestorer, a data-driven video restoration model, removes floaters and artifacts from novel views; ConsistView, a multi-view joint restoration mechanism, restores multiple views simultaneously to maintain spatiotemporal coherence. An automated data collection pipeline supplies training data for WorldRestorer. Evaluation with CLIP-based metrics and a user study indicates improved visual fidelity and consistency under large viewpoint displacements while retaining single-image input, with a 77.20% user preference over WonderWorld. The authors describe WonderFree as the first model to enable interactive 3D world generation with free exploration from arbitrary angles and directions.

📝 Abstract

Interactive 3D scene generation from a single image has gained significant attention due to its potential to create immersive virtual worlds. However, a key weakness of current 3D generation methods is limited explorability: they cannot render high-quality images during larger maneuvers beyond the original viewpoint, particularly when moving forward into unseen areas. To address this challenge, we propose WonderFree, the first model that enables users to interactively generate 3D worlds with the freedom to explore from arbitrary angles and directions. Specifically, we decouple this challenge into two key subproblems: novel view quality, which concerns visual artifacts and floaters in novel views, and cross-view consistency, which concerns spatial consistency across different viewpoints. To enhance rendering quality in novel views, we introduce WorldRestorer, a data-driven video restoration model designed to eliminate floaters and artifacts. In addition, we present a data collection pipeline that automatically gathers training data for WorldRestorer, so that it can handle the varied scene styles needed for 3D scene generation. Furthermore, to improve cross-view consistency, we propose ConsistView, a multi-view joint restoration mechanism that simultaneously restores multiple perspectives while maintaining spatiotemporal coherence. Experimental results demonstrate that WonderFree not only enhances rendering quality across diverse viewpoints but also significantly improves global coherence and consistency. These improvements are confirmed by CLIP-based metrics and a user study showing a 77.20% preference for WonderFree over WonderWorld, enabling a seamless and immersive 3D exploration experience. The code, model, and data will be publicly available.
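
The abstract mentions CLIP-based metrics without defining them here. As a rough illustration only, one common way to score cross-view consistency is the mean cosine similarity between image embeddings of adjacent rendered views. The sketch below assumes the embeddings have already been computed by some image encoder; the function names and the adjacent-pair formulation are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def cross_view_consistency(embeddings: Sequence[Sequence[float]]) -> float:
    """Mean cosine similarity over adjacent views along a camera path.

    Assumption: `embeddings` holds one precomputed image embedding per
    rendered view, ordered along the trajectory. This is an illustrative
    score, not the metric defined in the paper.
    """
    if len(embeddings) < 2:
        raise ValueError("need at least two views")
    sims = [
        cosine_similarity(embeddings[i], embeddings[i + 1])
        for i in range(len(embeddings) - 1)
    ]
    return sum(sims) / len(sims)
```

A higher score means neighboring views look more alike in embedding space. In practice the embeddings would come from a CLIP image encoder applied to each rendered frame, and the score would be compared between methods on the same camera path.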

Problem

Research questions and friction points this paper is trying to address.

Improving novel view quality in 3D scene exploration

Ensuring cross-view consistency during 3D navigation

Enhancing interactivity and freedom in 3D world generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

WorldRestorer eliminates floaters and artifacts

Data pipeline collects varied training scenes

ConsistView maintains multi-view coherence

🔎 Similar Papers

NerfBaselines: Consistent and Reproducible Evaluation of Novel View Synthesis Methods