3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement

📅 2024-12-24
🤖 AI Summary
Existing 3D generation methods are limited by the scarcity of high-quality 3D data and by the modeling capacity of multi-view diffusion architectures, which leads to low-resolution outputs and inconsistent geometry and appearance across views. To address this, the paper proposes 3DEnhancer, a multi-view latent diffusion framework that combines a pose-aware encoder, a diffusion-based denoiser, and a multi-view attention module with epipolar aggregation. Unlike video-based approaches, 3DEnhancer refines coarse 3D inputs across diverse viewing angles while keeping the views consistent. The method pairs data augmentation with diffusion-based denoising to improve both resolution and cross-view consistency. Experiments show that 3DEnhancer outperforms existing methods on both multi-view enhancement and per-instance 3D optimization.

📝 Abstract
Despite advances in neural rendering, view synthesis and 3D model generation remain restricted to low resolutions with suboptimal multi-view consistency, owing to the scarcity of high-quality 3D datasets and the inherent limitations of multi-view diffusion models. In this study, we present a novel 3D enhancement pipeline, dubbed 3DEnhancer, which employs a multi-view latent diffusion model to enhance coarse 3D inputs while preserving multi-view consistency. Our method includes a pose-aware encoder and a diffusion-based denoiser to refine low-quality multi-view images, along with data augmentation and a multi-view attention module with epipolar aggregation to maintain consistent, high-quality 3D outputs across views. Unlike existing video-based approaches, our model supports seamless multi-view enhancement with improved coherence across diverse viewing angles. Extensive evaluations show that 3DEnhancer significantly outperforms existing methods on both multi-view enhancement and per-instance 3D optimization tasks.
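
The abstract does not say how epipolar aggregation is implemented, so the following is only a sketch of the standard two-view geometry that such a module could build on, not the authors' code. It assumes a pinhole camera with shared intrinsics (focal length `f`, principal point `cx`, `cy`) and a relative pose `R`, `t` that maps camera-1 coordinates to camera-2 coordinates. All function names are hypothetical. Given a pixel in one view, its match in the other view must lie on the epipolar line, which is where a cross-view attention layer could concentrate its weights.

```python
import math

def matmul(a, b):
    # Plain nested-list matrix product.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def skew(t):
    # Cross-product matrix [t]_x, so that skew(t) @ v == t x v.
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]

def intrinsics_inverse(f, cx, cy):
    # Closed-form inverse of K = [[f, 0, cx], [0, f, cy], [0, 0, 1]].
    return [[1.0 / f, 0.0, -cx / f], [0.0, 1.0 / f, -cy / f], [0.0, 0.0, 1.0]]

def fundamental_matrix(f, cx, cy, R, t):
    # F = K^-T [t]_x R K^-1 (assumes both views share the same K).
    K_inv = intrinsics_inverse(f, cx, cy)
    E = matmul(skew(t), R)  # essential matrix
    return matmul(matmul(transpose(K_inv), E), K_inv)

def epipolar_line(F, pixel):
    # Line (a, b, c) in view 2 with a*u + b*v + c = 0 for the match of `pixel`.
    u, v = pixel
    col = matmul(F, [[u], [v], [1.0]])
    return (col[0][0], col[1][0], col[2][0])

def distance_to_line(line, pixel):
    # Perpendicular pixel distance; could serve as an attention bias or mask.
    a, b, c = line
    u, v = pixel
    return abs(a * u + b * v + c) / math.hypot(a, b)

def project(f, cx, cy, X):
    # Pinhole projection of a 3D point given in camera coordinates.
    x, y, z = X
    return (f * x / z + cx, f * y / z + cy)
```

In an attention layer, `distance_to_line` could be evaluated for every key position in the second view and turned into a soft weight, so that tokens far from the epipolar line contribute little. This is one possible design, stated here as an assumption rather than a description of 3DEnhancer.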
Problem

Research questions and friction points this paper is trying to address.

3D Data Scarcity
Multi-View Modeling Limitations
Low-Resolution 3D Visualization
Innovation

Methods, ideas, or system contributions that make the work stand out.

3D Image Quality Enhancement
Multi-View Modeling
Attention Module