InstructHumans: Editing Animated 3D Human Textures with Instructions

📅 2024-04-05

🏛️ arXiv.org

📈 Citations: 3

✨ Influential: 0

🤖 AI Summary

Existing text-driven 3D editing methods that apply Score Distillation Sampling (SDS) directly face a trade-off between consistency with the source avatar and fidelity to the text instruction, which often shows up as geometric distortions, artifacts, or blurred textures. To address this, we propose SDS for Editing (SDS-E), which selectively incorporates subterms of the SDS gradient across diffusion timesteps to preserve structural integrity. SDS-E is further enhanced with (i) a spatial smoothness regularization that enforces texture continuity and (ii) a gradient-based viewpoint sampling strategy that improves multi-view consistency. SDS-E leaves the original 3D human geometry and animation sequence unchanged while improving the sharpness and semantic fidelity of the edited texture. Experiments show that the method outperforms existing text-driven 3D editing methods both qualitatively and quantitatively.

📝 Abstract

We present InstructHumans, a novel framework for instruction-driven 3D human texture editing. Existing text-based editing methods use Score Distillation Sampling (SDS) to distill guidance from generative models. This work shows that naively using such scores is harmful to editing as they destroy consistency with the source avatar. Instead, we propose an alternate SDS for Editing (SDS-E) that selectively incorporates subterms of SDS across diffusion timesteps. We further enhance SDS-E with spatial smoothness regularization and gradient-based viewpoint sampling to achieve high-quality edits with sharp and high-fidelity detailing. InstructHumans significantly outperforms existing 3D editing methods, producing edits that stay consistent with the initial avatar while remaining faithful to the textual instructions. Project page: https://jyzhu.top/instruct-humans
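
For readers unfamiliar with the "subterms of SDS" mentioned above, the block below writes out the standard SDS gradient with classifier-free guidance and the two-term split that follows from it. This is a sketch of the general formulation, not the paper's exact SDS-E definition: the abstract does not state which subterms are kept at which timesteps, and instruction-conditioned diffusion models carry an additional image-conditioning term that is omitted here. The symbols are assumptions for illustration: $\theta$ are the texture parameters, $x$ a rendered view, $x_t$ its noised version at timestep $t$, $y$ the text instruction, $s$ the guidance scale, and $w(t)$ a timestep weighting.

```latex
% Standard SDS gradient for a rendered view x = g(\theta)
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(x_t; y, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right]

% Classifier-free guidance with scale s
\hat{\epsilon}_\phi(x_t; y, t)
  = \epsilon_\phi(x_t; \varnothing, t)
  + s\,\bigl(\epsilon_\phi(x_t; y, t) - \epsilon_\phi(x_t; \varnothing, t)\bigr)

% Substituting gives two subterms that can be weighted separately
\hat{\epsilon}_\phi(x_t; y, t) - \epsilon
  = \underbrace{\epsilon_\phi(x_t; \varnothing, t) - \epsilon}_{\text{denoising term}}
  + s\,\underbrace{\bigl(\epsilon_\phi(x_t; y, t) - \epsilon_\phi(x_t; \varnothing, t)\bigr)}_{\text{guidance term}}
```

An editing-oriented variant in the spirit of the abstract would apply timestep-dependent weights to these subterms instead of always summing them in full.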

Problem

Research questions and friction points this paper is trying to address.

Editing 3D human textures with text instructions

Maintaining consistency with source avatar during edits

Improving sharpness and fidelity of texture details

Innovation

Methods, ideas, or system contributions that make the work stand out.

SDS-E, a modified SDS that preserves consistency with the source avatar

Spatial smoothness regularization for texture continuity

Gradient-based viewpoint sampling for fidelity
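
The last two items can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `sample_views` and `smoothness_penalty`, the `temperature` parameter, and the choice to store the texture as an `(H, W, C)` image grid are all assumptions made for illustration. The idea shown is only that views with larger recent gradient magnitude are sampled more often, and that differences between neighbouring texels are penalized.

```python
import numpy as np


def sample_views(grad_norms, n_samples, rng, temperature=1.0, floor=1e-8):
    """Draw view indices with probability increasing in per-view gradient magnitude.

    grad_norms: one non-negative gradient magnitude per candidate camera view
    (hypothetical bookkeeping; how these are tracked is not given in the source).
    """
    g = np.asarray(grad_norms, dtype=np.float64)
    # The floor keeps every view reachable even when its gradient is zero.
    weights = np.power(g + floor, 1.0 / temperature)
    probs = weights / weights.sum()
    indices = rng.choice(len(g), size=n_samples, p=probs)
    return indices, probs


def smoothness_penalty(texture):
    """Mean squared difference between neighbouring texels of an (H, W, C) grid."""
    t = np.asarray(texture, dtype=np.float64)
    dh = t[1:, :, :] - t[:-1, :, :]   # vertical neighbours
    dw = t[:, 1:, :] - t[:, :-1, :]   # horizontal neighbours
    return float((dh ** 2).mean() + (dw ** 2).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    indices, probs = sample_views([0.1, 0.2, 0.7], n_samples=5, rng=rng)
    print("sampling probabilities:", probs)
    print("sampled view indices:", indices)
    print("penalty on flat texture:", smoothness_penalty(np.ones((4, 4, 3))))
```

In a training loop, the sampled indices would select which camera views to render for the next optimization step, and the penalty would be added to the editing loss with a small weight; both choices are assumptions here.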
