Multi-Object Sketch Animation by Scene Decomposition and Motion Planning

2025-03-25
Citations: 0
Influential: 0
AI Summary
Existing sketch animation methods struggle to model object-aware motion and jointly optimize complex dynamics in multi-object scenes. This paper proposes the first end-to-end, zero-shot framework for generating animated GIFs and short videos from hand-drawn multi-object sketches. Our method first leverages a large language model (LLM) for semantic-driven scene decomposition and coarse motion planning. It then introduces a compositional Score Distillation Sampling (SDS) mechanism, integrating differentiable compositing rendering with a motion-refinement neural network to enable object-level motion modeling and joint optimization. Crucially, the framework requires no training data; animation is synthesized iteratively via SDS, guided solely by text prompts. Experiments demonstrate substantial improvements over prior art in animation quality, motion plausibility, and object independence. Our approach systematically addresses the two core challenges in multi-object sketch animation: object-aware motion modeling and optimization of intricate, interdependent dynamics.

Abstract
Sketch animation, which brings static sketches to life by generating dynamic video sequences, has found widespread applications in GIF design, cartoon production, and daily entertainment. While current sketch animation methods perform well in single-object sketch animation, they struggle in multi-object scenarios. By analyzing their failures, we summarize two challenges of transitioning from single-object to multi-object sketch animation: object-aware motion modeling and complex motion optimization. For multi-object sketch animation, we propose MoSketch based on iterative optimization through Score Distillation Sampling (SDS), without any other data for training. We propose four modules: LLM-based scene decomposition, LLM-based motion planning, motion refinement network and compositional SDS, to tackle the two challenges in a divide-and-conquer strategy. Extensive qualitative and quantitative experiments demonstrate the superiority of our method over existing sketch animation approaches. MoSketch takes a pioneering step towards multi-object sketch animation, opening new avenues for future research and applications. The code will be released.
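The divide-and-conquer flow described in the abstract (decompose the scene into objects, plan a coarse motion per object, then compose the result) can be pictured with a small sketch. This is only an illustration under stated assumptions, not the MoSketch implementation: the `SketchObject` type, the per-frame `(dx, dy)` plan format, and the hard-coded decomposition are all hypothetical stand-ins for what an LLM might return.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point = Tuple[float, float]


@dataclass
class SketchObject:
    """One decomposed object: a name plus the control points of its strokes."""
    name: str
    points: List[Point]


def apply_motion_plan(obj: SketchObject, plan: List[Point]) -> List[List[Point]]:
    """Translate an object's points by one planned (dx, dy) offset per frame."""
    return [[(x + dx, y + dy) for (x, y) in obj.points] for (dx, dy) in plan]


def composite(per_object_frames: Dict[str, List[List[Point]]]) -> List[List[Point]]:
    """Merge per-object frames into scene frames by concatenating their points."""
    num_frames = len(next(iter(per_object_frames.values())))
    return [
        [p for frames in per_object_frames.values() for p in frames[t]]
        for t in range(num_frames)
    ]


# Hypothetical decomposition and coarse plan, standing in for LLM output.
objects = [
    SketchObject("ball", [(0.0, 0.0), (1.0, 0.0)]),
    SketchObject("dog", [(5.0, 5.0)]),
]
motion_plan = {
    "ball": [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],  # moves right one unit per frame
    "dog": [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)],   # stays in place
}

frames = composite(
    {o.name: apply_motion_plan(o, motion_plan[o.name]) for o in objects}
)
```

Keeping each object's trajectory separate until the final compositing step is what lets one object move while another stays still, which is the object-aware behaviour the abstract asks for.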
Problem

Research questions and friction points this paper is trying to address.

Challenges in multi-object sketch animation motion modeling
Difficulties in complex motion optimization for sketches
Lack of training data for multi-object animation
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based scene decomposition and motion planning
Motion refinement network for complex optimization
Compositional SDS for iterative optimization
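To make the idea of optimizing per-object motion through a composited scene more concrete, here is a toy sketch of an SDS-style update. It uses the standard SDS gradient form, w(t) (predicted noise minus injected noise), pushed back through the renderer. Everything else is an assumption for illustration: the "renderer" is a plain concatenation of per-object offsets, the noise predictor is an analytic stand-in rather than a text-to-video diffusion model, and the weighting w(t) = sigma is chosen arbitrarily. It is not the paper's compositional SDS.

```python
import random
from typing import Dict, List


def render(offsets: Dict[str, List[float]]) -> List[float]:
    """Toy compositing: flatten per-object offsets into one scene vector."""
    return [v for name in sorted(offsets) for v in offsets[name]]


def toy_noise_predictor(x_t, alpha, sigma, target):
    """Stand-in for a diffusion model: the exact noise estimate if the
    only plausible scene were `target` (hypothetical, for illustration)."""
    return [(xt - alpha * m) / sigma for xt, m in zip(x_t, target)]


def sds_step(offsets, target, lr, rng):
    """One SDS-style update of every object's offsets, in place."""
    x = render(offsets)
    t = rng.uniform(0.2, 0.8)                        # random noise level
    alpha, sigma = (1.0 - t) ** 0.5, t ** 0.5        # alpha^2 + sigma^2 = 1
    eps = [rng.gauss(0.0, 1.0) for _ in x]           # injected noise
    x_t = [alpha * xi + sigma * e for xi, e in zip(x, eps)]
    eps_hat = toy_noise_predictor(x_t, alpha, sigma, target)
    weight = sigma                                   # assumed weighting w(t)
    grad = [weight * (eh - e) for eh, e in zip(eps_hat, eps)]
    # The toy renderer has an identity Jacobian, so each slice of the scene
    # gradient maps straight back onto the object that produced it.
    i = 0
    for name in sorted(offsets):
        n = len(offsets[name])
        offsets[name] = [v - lr * g for v, g in zip(offsets[name], grad[i:i + n])]
        i += n


rng = random.Random(0)
offsets = {"ball": [0.0, 0.0], "dog": [0.0, 0.0]}
target = [2.0, 0.0, 0.0, 1.0]  # same order as render(): ball, then dog
for _ in range(200):
    sds_step(offsets, target, lr=0.5, rng=rng)
```

In this toy setting the injected noise cancels out of the gradient, so each object's offsets contract toward the scene the stand-in predictor prefers. With a real diffusion prior the gradient is noisy and only points toward plausible motion on average.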
Authors
Jingyu Liu
Renmin University of China
Zijie Xin
Renmin University of China
Video understanding, Multi-modal learning, Cross-modal retrieval, Computer Vision
Yuhan Fu
Renmin University of China
Ruixiang Zhao
Renmin University of China
Bangxiang Lan
Renmin University of China
Xirong Li
Renmin University of China