InstructHumans: Editing Animated 3D Human Textures with Instructions

📅 2024-04-05
🏛️ arXiv.org
📈 Citations: 3
Influential: 0

🤖 AI Summary
Existing text-driven 3D editing methods that apply Score Distillation Sampling (SDS) directly face a trade-off between consistency with the source avatar and fidelity to the text instruction, often producing distorted or blurry textures. To address this, InstructHumans introduces SDS for Editing (SDS-E), which selectively incorporates subterms of the SDS gradient across diffusion timesteps so that edits stay consistent with the source avatar. SDS-E is further combined with (i) a spatial smoothness regularization and (ii) a gradient-based viewpoint sampling strategy, which together yield sharp, high-fidelity texture details. The framework edits only the texture and leaves the 3D human geometry and animation sequence unchanged. The authors report that InstructHumans significantly outperforms existing 3D editing methods, both in consistency with the initial avatar and in faithfulness to the textual instructions.
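
For orientation, the standard SDS gradient that SDS-E modifies can be written as below. This is the usual formulation from the score-distillation literature with classifier-free guidance, not an equation taken from this page. The symbols are assumptions introduced here: $\theta$ for the texture parameters, $x$ for a rendered view, $x_t$ for its noised version, $y$ for the text instruction, $s$ for the guidance scale, and $w(t)$ for a timestep weight. The paper's own guidance model may condition on additional inputs.

```latex
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[ w(t)\,\bigl(\hat{\epsilon}_\phi(x_t; y, t) - \epsilon\bigr)\,
    \frac{\partial x}{\partial \theta} \right],
\qquad
\hat{\epsilon}_\phi(x_t; y, t)
  = \epsilon_\phi(x_t; \varnothing, t)
  + s\,\bigl(\epsilon_\phi(x_t; y, t) - \epsilon_\phi(x_t; \varnothing, t)\bigr)

% Substituting the guided prediction splits the residual into two subterms:
\hat{\epsilon}_\phi(x_t; y, t) - \epsilon
  = \underbrace{\epsilon_\phi(x_t; \varnothing, t) - \epsilon}_{\text{prior (denoising) term}}
  + \underbrace{s\,\bigl(\epsilon_\phi(x_t; y, t) - \epsilon_\phi(x_t; \varnothing, t)\bigr)}_{\text{guidance term}}
```

A decomposition of this kind is what makes it possible to keep or drop individual subterms at different timesteps. Which subterms SDS-E actually keeps, and when, is not stated on this page.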

📝 Abstract
We present InstructHumans, a novel framework for instruction-driven 3D human texture editing. Existing text-based editing methods use Score Distillation Sampling (SDS) to distill guidance from generative models. This work shows that naively using such scores is harmful to editing as they destroy consistency with the source avatar. Instead, we propose an alternate SDS for Editing (SDS-E) that selectively incorporates subterms of SDS across diffusion timesteps. We further enhance SDS-E with spatial smoothness regularization and gradient-based viewpoint sampling to achieve high-quality edits with sharp and high-fidelity detailing. InstructHumans significantly outperforms existing 3D editing methods, remaining consistent with the initial avatar while staying faithful to the textual instructions. Project page: https://jyzhu.top/instruct-humans
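
The idea of selectively incorporating SDS subterms across timesteps can be illustrated with a minimal Python sketch. It assumes the common split of a classifier-free-guided SDS residual into a prior term (unconditional prediction minus injected noise) and a guidance term (scaled conditional minus unconditional prediction). The function names, the `t_switch` threshold, and the rule of dropping the prior term at low timesteps are hypothetical placeholders chosen for illustration. They are not the selection rule used by SDS-E, which the abstract does not spell out.

```python
from typing import List, Sequence, Tuple


def split_sds_residual(
    eps_uncond: Sequence[float],
    eps_cond: Sequence[float],
    eps: Sequence[float],
    guidance_scale: float,
) -> Tuple[List[float], List[float]]:
    """Split the guided SDS residual into a prior term and a guidance term."""
    # Prior term: unconditional noise prediction minus the injected noise.
    prior_term = [u - e for u, e in zip(eps_uncond, eps)]
    # Guidance term: scaled difference between conditional and unconditional predictions.
    guidance_term = [guidance_scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]
    return prior_term, guidance_term


def selective_sds_direction(
    eps_uncond: Sequence[float],
    eps_cond: Sequence[float],
    eps: Sequence[float],
    t: float,
    guidance_scale: float = 7.5,
    t_switch: float = 0.5,  # hypothetical threshold on a normalized timestep in [0, 1]
) -> List[float]:
    """Combine the subterms with a timestep-dependent mask (illustrative rule only)."""
    prior_term, guidance_term = split_sds_residual(
        eps_uncond, eps_cond, eps, guidance_scale
    )
    # Assumption for illustration: keep the prior term only at high noise levels.
    keep_prior = 1.0 if t >= t_switch else 0.0
    return [g + keep_prior * p for g, p in zip(guidance_term, prior_term)]


if __name__ == "__main__":
    high_t = selective_sds_direction([1.0, 1.0], [2.0, 2.0], [0.0, 0.0], t=0.9, guidance_scale=2.0)
    low_t = selective_sds_direction([1.0, 1.0], [2.0, 2.0], [0.0, 0.0], t=0.1, guidance_scale=2.0)
    print(high_t, low_t)
```

In a real pipeline the returned direction would be multiplied by a timestep weight and backpropagated through the renderer to the texture parameters. Flat lists of floats stand in for image-shaped tensors here to keep the sketch dependency-free.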
Problem

Research questions and friction points this paper is trying to address.

Editing 3D human textures with text instructions
Maintaining consistency with source avatar during edits
Improving sharpness and fidelity of texture details
Innovation

Methods, ideas, or system contributions that make the work stand out.

SDS for Editing (SDS-E), a modified SDS that preserves consistency with the source avatar
Spatial smoothness regularization for sharp details
Gradient-based viewpoint sampling for fidelity
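
The last two contributions can be pictured with a small Python sketch, under loudly stated assumptions. The texture is treated as a 2D grid of scalar texels, whereas the paper's actual texture representation may differ. The smoothness term is a plain neighbor-difference penalty. Camera views are drawn with probability proportional to a per-view gradient magnitude. All function names and both rules are hypothetical illustrations, not the authors' formulation.

```python
import random
from typing import List, Sequence


def smoothness_penalty(texture: Sequence[Sequence[float]]) -> float:
    """Mean squared difference between vertically and horizontally adjacent texels."""
    height, width = len(texture), len(texture[0])
    total, count = 0.0, 0
    for i in range(height):
        for j in range(width):
            if i + 1 < height:  # vertical neighbor
                total += (texture[i + 1][j] - texture[i][j]) ** 2
                count += 1
            if j + 1 < width:  # horizontal neighbor
                total += (texture[i][j + 1] - texture[i][j]) ** 2
                count += 1
    return total / count if count else 0.0


def view_probabilities(grad_norms: Sequence[float], floor: float = 1e-8) -> List[float]:
    """Normalize per-view gradient magnitudes into sampling probabilities."""
    # The small floor keeps every view reachable even if its gradient is zero.
    scores = [max(g, 0.0) + floor for g in grad_norms]
    normalizer = sum(scores)
    return [s / normalizer for s in scores]


def sample_view(grad_norms: Sequence[float], rng: random.Random) -> int:
    """Draw a view index, favoring views with larger recent gradient magnitude."""
    probs = view_probabilities(grad_norms)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    print(smoothness_penalty([[0.0, 1.0], [1.0, 0.0]]))
    print(view_probabilities([1.0, 3.0]))
    print(sample_view([0.0, 5.0, 1.0], rng))
```

In an optimization loop, the penalty would be added to the editing loss with a small weight, and the per-view magnitudes would be refreshed as training proceeds so that sampling follows where the edit is still changing.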