DogWeave: High-Fidelity 3D Canine Reconstruction from a Single Image via Normal Fusion and Conditional Inpainting

📅 2026-03-08

🤖 AI Summary

Monocular 3D animal reconstruction remains challenging due to complex poses, severe self-occlusions, intricate fur details, and the absence of articulated 3D supervision and of multi-view data, particularly back views. These gaps often result in geometric distortions and texture inconsistencies. To address them, this work proposes DogWeave, a framework that refines a parametric mesh into a high-fidelity signed distance field (SDF) geometry through multi-view normal field optimization with diffusion-enhanced normals. It then generates view-consistent, photorealistic textures through conditional partial inpainting guided by structure and style cues. Trained on only about 7,000 unannotated dog images without any 3D labels, the method outperforms current state-of-the-art approaches in both geometric accuracy and texture realism, producing complete and lifelike 3D reconstructions of dogs.

📝 Abstract

Monocular 3D animal reconstruction is challenging due to complex articulation, self-occlusion, and fine-scale details such as fur. Existing methods often produce distorted geometry and inconsistent textures due to the lack of articulated 3D supervision and limited availability of back-view images in 2D datasets, which makes reconstructing unobserved regions particularly difficult. To address these limitations, we propose DogWeave, a model-based framework for reconstructing high-fidelity 3D canine models from a single RGB image. DogWeave improves geometry by refining a coarsely initialized parametric mesh into a detailed SDF representation through multi-view normal field optimization using diffusion-enhanced normals. It then generates view-consistent textures through conditional partial inpainting guided by structure and style cues, enabling realistic reconstruction of unobserved regions. Using only about 7,000 dog images processed via our 2D pipeline for training, DogWeave produces complete, realistic 3D models and outperforms state-of-the-art single-image-to-3D reconstruction methods in both shape accuracy and texture realism for canines.
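
To make the link between an SDF and a normal field concrete, here is a minimal sketch, not the authors' implementation. It relies on the standard fact that the surface normal of an SDF is its normalized spatial gradient, and it compares those normals with target normals using a cosine-based loss. The analytic sphere SDF, the finite-difference gradient, the loss form, and all function names (`sphere_sdf`, `sdf_normals`, `normal_loss`) are assumptions chosen for illustration; the abstract does not specify the network, the renderer, or the exact objective.

```python
import numpy as np

def sphere_sdf(points, radius=1.0):
    """Signed distance from each point to a sphere centered at the origin (toy stand-in for a learned SDF)."""
    return np.linalg.norm(points, axis=-1) - radius

def sdf_normals(sdf, points, eps=1e-4):
    """Approximate unit normals as the normalized central-difference gradient of the SDF."""
    grads = np.zeros_like(points, dtype=float)
    for axis in range(3):
        offset = np.zeros(3)
        offset[axis] = eps
        grads[:, axis] = (sdf(points + offset) - sdf(points - offset)) / (2.0 * eps)
    norms = np.linalg.norm(grads, axis=-1, keepdims=True)
    return grads / np.clip(norms, 1e-12, None)

def normal_loss(predicted, target):
    """Mean (1 - cosine similarity) between two sets of unit normals (hypothetical loss form)."""
    cos = np.sum(predicted * target, axis=-1)
    return float(np.mean(1.0 - cos))

# Surface points on the unit sphere; their true normals equal the points themselves.
points = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]])
normals = sdf_normals(sphere_sdf, points)
target = points / np.linalg.norm(points, axis=-1, keepdims=True)
loss = normal_loss(normals, target)
```

In an optimization setting, the target normals would come from an external source (in the paper's case, diffusion-enhanced normals from multiple views) and the loss would be minimized with respect to the SDF parameters; here both sides are analytic, so the loss is effectively zero.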

Problem

Research questions and friction points this paper is trying to address.

monocular 3D reconstruction

canine modeling

self-occlusion

texture consistency

fine-scale details

Innovation

Methods, ideas, or system contributions that make the work stand out.

Normal Fusion

Conditional Inpainting

SDF Representation