🤖 AI Summary
This work addresses the challenge of generating temporally coherent, high-fidelity animations from a single character artwork image, with precise camera trajectory control. We propose a two-stage inpainting paradigm: first constructing a camera-aware Gaussian scene field to model stable background motion, then injecting pose-aware character dynamics for controllable character animation. Key contributions include: (i) the first decoupling of single-image animation into “camera-aware scene inpainting” and “pose-aware video inpainting”; and (ii) a gated DiT-based video diffusion model that adaptively fuses character appearance, skeletal pose, and background video features. Our method integrates Stable Diffusion–based image inpainting, optimizable Gaussian splatting, DiT-based video generation, and pose-conditioned encoding. Experiments demonstrate superior temporal coherence and visual fidelity under complex camera motions, and significant improvements over state-of-the-art single-image animation methods across diverse character styles and scenes.
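The two-stage paradigm above can be sketched as a minimal pipeline. This is an illustrative skeleton only: the function names, data shapes, and placeholder stage bodies are assumptions for exposition, not the authors' actual API (a real system would inpaint multi-view images, optimize a Gaussian scene field, and run a pose-conditioned video diffusion model).

```python
import numpy as np

def camera_aware_scene_inpainting(reference_art, camera_trajectory):
    """Stage 1 (sketch): render a coarse background video along the camera
    trajectory. Stands in for multi-view inpainting + Gaussian field
    optimization; here we simply repeat the reference frame per camera pose."""
    return np.stack([reference_art for _ in camera_trajectory])  # (T, H, W, C)

def pose_aware_video_inpainting(background_video, character_appearance, poses):
    """Stage 2 (sketch): inject the dynamic character into each background
    frame. Stands in for the gated DiT-based video diffusion model."""
    assert len(poses) == background_video.shape[0]
    # Placeholder: a real model denoises conditioned on appearance and pose.
    return background_video

def dreamdance_pipeline(reference_art, camera_trajectory, character_appearance, poses):
    """Chain the two inpainting-based stages into one animation call."""
    bg_video = camera_aware_scene_inpainting(reference_art, camera_trajectory)
    return pose_aware_video_inpainting(bg_video, character_appearance, poses)
```

The key design point the sketch preserves is the decoupling: camera control is resolved entirely in stage 1, so stage 2 only has to add character dynamics on top of an already camera-consistent background video.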
📝 Abstract
This paper presents DreamDance, a novel character art animation framework that produces stable, consistent character and scene motion conditioned on precise camera trajectories. To achieve this, we reformulate the animation task as two inpainting-based steps: Camera-aware Scene Inpainting and Pose-aware Video Inpainting. The first step leverages a pre-trained image inpainting model to generate multi-view scene images from the reference art and optimizes a stable, large-scale Gaussian field, which enables coarse background video rendering along camera trajectories. However, the rendered video is rough and conveys only scene motion. To resolve this, the second step trains a pose-aware video inpainting model that injects the dynamic character into the scene video while enhancing background quality. Specifically, this model is a DiT-based video generation model with a gating strategy that adaptively integrates the character's appearance and pose information into the base background video. Through extensive experiments, we demonstrate the effectiveness and generalizability of DreamDance, which produces high-quality, consistent character animations with remarkable camera dynamics.
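The adaptive gating strategy can be illustrated with a minimal fusion block. This is an assumed form, not the paper's exact layer: a learned sigmoid gate decides, per token and channel, how much character-conditioned signal to inject into the background video features, with the gate computed from both streams.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GatedFusion:
    """Sketch of an adaptive gating block (hypothetical parameterization):
    fuses character appearance/pose features into background video features
    via a learned convex combination."""

    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        # Gate is predicted from the concatenation of both feature streams.
        self.w_gate = rng.standard_normal((2 * dim, dim)) * 0.02
        self.b_gate = np.zeros(dim)

    def __call__(self, bg_feat, char_feat):
        # bg_feat, char_feat: (tokens, dim)
        g = sigmoid(np.concatenate([bg_feat, char_feat], axis=-1) @ self.w_gate
                    + self.b_gate)
        # Elementwise convex combination: g -> 1 injects character signal,
        # g -> 0 preserves the rendered background features.
        return g * char_feat + (1.0 - g) * bg_feat
```

Because the gate lies in (0, 1), the fused features are always an elementwise interpolation of the two streams, which lets the model preserve the stable background where no character is present and inject character dynamics where it is.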