SakugaFlow: A Stagewise Illustration Framework Emulating the Human Drawing Process and Providing Interactive Tutoring for Novice Drawing Skills

📅 2025-06-10
🤖 AI Summary
Problem: Existing AI image generation tools produce high-quality outputs but do not model the stepwise way human artists create, which limits their usefulness in art pedagogy. Method: We propose a four-stage progressive drawing framework that integrates a Stable Diffusion variant with an LLM-driven multi-turn instructional dialogue system. The diffusion process is decoupled into interpretable intermediate representations (such as anatomy, perspective, and composition) and supports nonlinear editing and branching creative exploration. Contribution/Results: We introduce the “generation-as-instruction” paradigm, which turns AI from a content generator into an educational creative scaffold. In user studies with novice artists, 87% successfully transitioned to autonomous composition, structural error rates decreased by 32%, and both conceptual understanding and practical drawing proficiency improved significantly.

📝 Abstract
While current AI illustration tools can generate high-quality images from text prompts, they rarely reveal the step-by-step procedure that human artists follow. We present SakugaFlow, a four-stage pipeline that pairs diffusion-based image generation with a large-language-model tutor. At each stage, novices receive real-time feedback on anatomy, perspective, and composition, can revise any step non-linearly, and can branch into alternative versions. By exposing intermediate outputs and embedding pedagogical dialogue, SakugaFlow turns a black-box generator into a scaffolded learning environment that supports both creative exploration and skill acquisition.
Problem

Research questions and friction points this paper is trying to address.

Emulate human drawing process for novice artists
Provide interactive tutoring for drawing skills
Combine AI generation with step-by-step feedback
Innovation

Methods, ideas, or system contributions that make the work stand out.

Four-stage pipeline with diffusion-based generation
Real-time feedback via large-language-model tutor
Non-linear revision and branching alternatives
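One way to picture non-linear revision and branching is as a tree of stage outputs, where redoing an earlier stage adds a sibling branch instead of overwriting the original. The sketch below is illustrative only and is not taken from the paper: the `Node` class, its `advance` and `revise` methods, and the `STAGES` labels are all hypothetical, and plain strings stand in for images and tutor feedback.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional

# Placeholder labels (assumption): the source only says there are four stages.
STAGES = ("stage_1", "stage_2", "stage_3", "stage_4")


@dataclass
class Node:
    """One intermediate output in a hypothetical drawing history."""
    stage: int                       # index into STAGES
    note: str                        # stand-in for an image or tutor feedback
    parent: Optional[Node] = None
    children: list[Node] = field(default_factory=list)

    def advance(self, note: str) -> Node:
        """Continue from this node to the next stage."""
        if self.stage + 1 >= len(STAGES):
            raise ValueError("already at the final stage")
        child = Node(self.stage + 1, note, parent=self)
        self.children.append(child)
        return child

    def revise(self, note: str) -> Node:
        """Redo this node's stage as a sibling branch; the original is kept."""
        sibling = Node(self.stage, note, parent=self.parent)
        if self.parent is not None:
            self.parent.children.append(sibling)
        return sibling

    def path(self) -> list[str]:
        """Notes from the root down to this node."""
        notes: list[str] = []
        node: Optional[Node] = self
        while node is not None:
            notes.append(node.note)
            node = node.parent
        return notes[::-1]


# Work through three stages, then go back and branch at the second one.
root = Node(0, "draft v1")
second = root.advance("second v1")
third = second.advance("third v1")
alt_second = second.revise("second v2")   # earlier work stays intact
alt_third = alt_second.advance("third v2")
```

Because `revise` never mutates the node it is called on, both `third.path()` and `alt_third.path()` remain valid histories that a learner could compare side by side.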
Kazuki Kawamura
The University of Tokyo, Sony, Sony CSL Kyoto
Machine Learning, AI-Guided Learning, Human-Computer Interaction, Human Augmentation
Jun Rekimoto
Sony Computer Science Laboratories, Inc., Japan and The University of Tokyo, Japan