AniCrafter: Customizing Realistic Human-Centric Animation via Avatar-Background Conditioning in Video Diffusion Models

📅 2025-05-26
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Current video diffusion models struggle to generate human animations in open-domain dynamic backgrounds due to coarse-grained pose conditioning (e.g., DWPose/SMPL-X), resulting in poor character-background alignment and inaccurate motion following. To address this, we propose AvatarDiff, a human-centric animation model that introduces a novel avatar-background joint conditioning mechanism, reframing animation generation as a conditional inpainting task. Built upon an image-to-video diffusion architecture, AvatarDiff integrates multimodal motion representations (DWPose and SMPL-X) with a dual-path conditional encoder, significantly improving temporal coherence and cross-scene generalization under dynamic backgrounds. Extensive evaluation across diverse scenarios demonstrates that AvatarDiff produces high-fidelity, motion-accurate human animations, outperforming existing state-of-the-art methods. The code is publicly available.

Technology Category

Application Category

📝 Abstract
Recent advances in video diffusion models have significantly improved character animation techniques. However, current approaches rely on basic structural conditions such as DWPose or SMPL-X to animate character images, limiting their effectiveness in open-domain scenarios with dynamic backgrounds or challenging human poses. In this paper, we introduce $ extbf{AniCrafter}$, a diffusion-based human-centric animation model that can seamlessly integrate and animate a given character into open-domain dynamic backgrounds while following given human motion sequences. Built on cutting-edge Image-to-Video (I2V) diffusion architectures, our model incorporates an innovative"avatar-background"conditioning mechanism that reframes open-domain human-centric animation as a restoration task, enabling more stable and versatile animation outputs. Experimental results demonstrate the superior performance of our method. Codes will be available at https://github.com/MyNiuuu/AniCrafter.
Problem

Research questions and friction points this paper is trying to address.

Limitations in animating characters with dynamic backgrounds
Challenges in handling complex human poses effectively
Need for stable and versatile human-centric animation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Avatar-background conditioning mechanism
Restoration task reframing approach
Image-to-Video diffusion architecture
🔎 Similar Papers
No similar papers found.