🤖 AI Summary
This work addresses the challenge of generating vector graphic animations that simultaneously retain infinite scalability, respect precise geometric and color constraints, and exhibit high-fidelity motion. We propose a novel method integrating implicit neural representations with text-to-video diffusion models. Our approach introduces a hierarchical implicit neural representation that explicitly encodes paths, shapes, and deformations, thereby bridging the domain gap between vector graphics and diffusion-based video synthesis. To ensure temporal coherence, we incorporate Video Score Distillation Sampling (VSDS) and employ a differentiable deformation mapping to preserve structural integrity during motion synthesis. Experiments demonstrate significant improvements over prior methods in animation naturalness, geometric and chromatic fidelity, and editing flexibility. To our knowledge, this is the first framework enabling end-to-end, controllable, high-fidelity vector graphic animation generation.
📝 Abstract
Vector graphics, known for their scalability and user-friendliness, provide a unique approach to visual content compared to traditional pixel-based images. Animation of these graphics, driven by the motion of their elements, offers enhanced comprehensibility and controllability but often requires substantial manual effort. To automate this process, we propose a novel method that integrates implicit neural representations with text-to-video diffusion models for vector graphic animation. Our approach employs layered implicit neural representations to reconstruct vector graphics while preserving their inherent properties, such as infinite resolution and precise color and shape constraints; this effectively bridges the large domain gap between vector graphics and diffusion models. The neural representations are then optimized using video score distillation sampling, which leverages motion priors from pretrained text-to-video diffusion models. Finally, the vector graphics are warped to match the optimized representations, resulting in smooth animation. Experimental results validate the effectiveness of our method in generating vivid and natural vector graphic animations, demonstrating significant improvement over existing techniques, which are limited in flexibility and animation quality.
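For orientation, the score distillation step mentioned above can be written in the generic form commonly used for score distillation sampling. This is a sketch under assumptions, not the paper's exact objective: the symbols below (the representation parameters \theta, the differentiable renderer g, the noise predictor \hat{\epsilon}_\phi, the weighting w(t), and the noise schedule \alpha_t, \sigma_t) are introduced here for illustration, and the abstract does not specify the loss, its weighting, or any additional regularizers.

```latex
% Generic score distillation gradient (illustrative; all symbols are assumptions).
% x = g(\theta): video clip rendered from the implicit representation with parameters \theta
% y: text prompt,  t: diffusion timestep,  \epsilon \sim \mathcal{N}(0, I)
% \hat{\epsilon}_\phi: noise predictor of a frozen, pretrained text-to-video diffusion model
\[
  x_t = \alpha_t \, x + \sigma_t \, \epsilon ,
  \qquad
  \nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right]
\]
```

Read this way, the pretrained video model stays frozen and only supplies a gradient signal. The parameters of the implicit representation are updated so that the rendered clip looks more plausible to the model given the text prompt, which is how the motion prior is transferred.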