WonderHuman: Hallucinating Unseen Parts in Dynamic 3D Human Reconstruction

📅 2025-02-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the low fidelity of occluded regions (e.g., back and lateral views) in monocular dynamic human 3D reconstruction. We propose a dual-space optimization framework that pioneers the integration of Score Distillation Sampling (SDS) across canonical and observation spaces. Our method jointly leverages 2D diffusion-based generative priors, differentiable rendering, pose-aware feature modulation, and multi-view consistency constraints, augmented by a view-selection strategy to enhance visual coherence. The core innovation is a pose-guided, cross-view joint optimization mechanism for geometry and appearance. Experiments demonstrate that our approach significantly improves reconstruction quality of unseen regions from monocular input, achieving state-of-the-art photorealism and enabling high-fidelity dynamic human avatars.

📝 Abstract
In this paper, we present WonderHuman, which reconstructs dynamic human avatars from a monocular video for high-fidelity novel view synthesis. Previous dynamic human avatar reconstruction methods typically require the input video to fully cover the observed human body. In daily practice, however, one usually has access to limited viewpoints, such as monocular front-view videos, leaving previous methods unable to faithfully reconstruct the unseen parts of the human avatar. To tackle this issue, WonderHuman leverages 2D generative diffusion model priors to achieve high-quality, photorealistic reconstructions of dynamic human avatars from monocular videos, including accurate rendering of unseen body parts. Our approach introduces a Dual-Space Optimization technique, applying Score Distillation Sampling (SDS) in both the canonical and observation spaces to ensure visual consistency and enhance realism in dynamic human reconstruction. Additionally, we present a View Selection strategy and Pose Feature Injection to enforce consistency between the SDS predictions and the observed data, ensuring pose-dependent effects and higher fidelity in the reconstructed avatar. In the experiments, our method achieves state-of-the-art performance in producing photorealistic renderings from the given monocular video, particularly for the challenging unseen parts. The project page and source code can be found at https://wyiguanw.github.io/WonderHuman/.
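For context, Score Distillation Sampling (as introduced in DreamFusion) optimizes 3D scene parameters by pushing rendered images toward a 2D diffusion prior. A standard form of its gradient is sketched below; the exact weighting and conditioning used by WonderHuman may differ, and in its Dual-Space Optimization this gradient would be applied to renders from both the canonical and observation spaces:

```latex
\nabla_\theta \mathcal{L}_{\text{SDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\bigr)
      \frac{\partial x}{\partial \theta}
    \right]
```

Here $x$ is the image rendered from parameters $\theta$, $x_t$ is its noised version at diffusion timestep $t$, $\epsilon$ is the injected Gaussian noise, $\hat{\epsilon}_\phi$ is the diffusion model's noise prediction under conditioning $y$ (e.g., a text or pose prompt), and $w(t)$ is a timestep-dependent weight.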
Problem

Research questions and friction points this paper is trying to address.

Single-view Video
High-fidelity Human Model Reconstruction
Unseen Body Parts
Innovation

Methods, ideas, or system contributions that make the work stand out.

WonderHuman
Single-view Video
High-precision Human Model Reconstruction
Zilong Wang
Department of Computer Science, The University of Texas at Dallas, Richardson, Texas
Zhiyang Dou
Computer Graphics Group, The University of Hong Kong, Pokfulam, Hong Kong
Yuan Liu
School of Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
Cheng Lin
Computer Graphics Group, The University of Hong Kong, Pokfulam, Hong Kong
Xiao Dong
Unknown affiliation
DM, CV, ML
Yunhui Guo
UT Dallas
Computer Vision, Machine Learning, Edge Computing
Chenxu Zhang
ByteDance Inc.
Computer Graphics, Computer Vision, AI
Xin Li
Department of Computer Science & Engineering, Texas A&M University, College Station, Texas
Wenping Wang
Texas A&M University
Computer Graphics, Geometric Computing
Xiaohu Guo
University of Texas at Dallas
Computer Graphics, Computer Vision, Geometric Computing