🤖 AI Summary
Existing methods support only static image generation from hand-drawn sketches and lack controllable video animation synthesis. This paper introduces the first sketch-driven high-fidelity video generation framework, enabling non-expert users to generate dynamic content directly from arbitrary hand-drawn sketches and brief text prompts. Methodologically, we propose a Level-Based Sketch Control Strategy and a TempSpatial Attention mechanism to adaptively modulate guidance strength across varying user sketching proficiencies and to significantly improve inter-frame temporal coherence. Built upon diffusion models, our framework supports zero-shot transfer and multi-sketch joint control. Quantitative and qualitative evaluations demonstrate that our approach substantially outperforms prior art in both visual fidelity and temporal consistency, achieving, for the first time, end-to-end, controllable generation of videos from sketches.
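To make the Level-Based Sketch Control idea concrete, here is a minimal, hypothetical Python sketch of how guidance strength could be modulated by an estimate of sketch detail. The density-based level estimate, the threshold values, and the names `estimate_sketch_level` and `guidance_strength` are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def estimate_sketch_level(sketch: np.ndarray, n_levels: int = 5) -> int:
    """Bucket a binary sketch into a coarse 'drawing level' by stroke density.

    `sketch` is an (H, W) array with 1 for stroke pixels, 0 for background.
    The density thresholds are illustrative, not the paper's values.
    """
    density = sketch.mean()  # fraction of pixels covered by strokes
    # Denser, more detailed sketches land in a higher level bucket.
    thresholds = np.linspace(0.02, 0.15, n_levels - 1)
    return int(np.searchsorted(thresholds, density)) + 1

def guidance_strength(level: int, base: float = 1.0, n_levels: int = 5) -> float:
    """Map sketch level to a conditioning weight: detailed sketches are
    trusted more (stronger guidance), rough sketches less."""
    return base * level / n_levels

# Usage: a sparse doodle receives weaker control than a detailed drawing.
rough = (np.random.rand(256, 256) < 0.01).astype(np.float32)
detailed = (np.random.rand(256, 256) < 0.12).astype(np.float32)
print(guidance_strength(estimate_sketch_level(rough)))     # weaker guidance
print(guidance_strength(estimate_sketch_level(detailed)))  # stronger guidance
```

The resulting weight would then scale the sketch conditioning signal during diffusion sampling, so rough sketches constrain the generation loosely while polished ones constrain it tightly.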
📝 Abstract
With the advancement of generative artificial intelligence, previous studies have achieved the task of generating aesthetic images from hand-drawn sketches, meeting the public's demand for accessible drawing tools. However, these methods remain limited to static images and cannot control video animation generation with hand-drawn sketches. To address this gap, we propose VidSketch, the first method capable of generating high-quality video animations directly from any number of hand-drawn sketches and simple text prompts, bridging the divide between ordinary users and professional artists. Specifically, our method introduces a Level-Based Sketch Control Strategy that automatically adjusts the guidance strength of sketches during generation, accommodating users with varying drawing skills. Furthermore, a TempSpatial Attention mechanism is designed to enhance the spatiotemporal consistency of generated video animations, significantly improving coherence across frames. More detailed examples are available on our official website.
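As a rough illustration of what a TempSpatial Attention mechanism might look like, the following PyTorch sketch computes self-attention jointly over all frames' spatial tokens, so every patch can attend across both space and time in a single pass. The single-head layout and layer sizes are assumptions for clarity, not the architecture used in VidSketch.

```python
import torch
import torch.nn as nn

class TempSpatialAttention(nn.Module):
    """Minimal sketch of joint spatio-temporal self-attention: tokens from
    all frames attend to one another, so each patch can borrow appearance
    from every other frame as well as from its own frame."""

    def __init__(self, dim: int):
        super().__init__()
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, frames, tokens_per_frame, dim)
        b, t, n, d = x.shape
        x = x.reshape(b, t * n, d)           # merge time and space into one axis
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.scale
        out = attn.softmax(dim=-1) @ v       # every token sees every frame
        return self.proj(out).reshape(b, t, n, d)

# Usage: 8 frames of 16x16 latent patches with 64-dim features.
x = torch.randn(2, 8, 256, 64)
print(TempSpatialAttention(64)(x).shape)  # torch.Size([2, 8, 256, 64])
```

Letting tokens attend across frames in this way is one standard route to inter-frame coherence, since appearance details are shared globally rather than regenerated independently per frame.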