🤖 AI Summary
Existing AI image generation tools produce high-quality outputs but lack modeling of human stepwise artistic creation, limiting their utility in art pedagogy. Method: We propose a four-stage progressive drawing framework that integrates a Stable Diffusion variant with an LLM-driven multi-turn instructional dialogue system. The diffusion process is decoupled into interpretable intermediate representations—such as anatomy, perspective, and composition—and supports nonlinear editing and branching creative exploration. Contribution/Results: We introduce the “generation-as-instruction” paradigm, transforming AI from a content generator into an educational creative scaffold. User studies with novice artists show that 87% successfully transitioned to autonomous composition, structural error rates decreased by 32%, and both conceptual understanding and practical drawing proficiency improved significantly.
📝 Abstract
While current AI illustration tools can generate high-quality images from text prompts, they rarely reveal the step-by-step procedure that human artists follow. We present SakugaFlow, a four-stage pipeline that pairs diffusion-based image generation with a large-language-model tutor. At each stage, novices receive real-time feedback on anatomy, perspective, and composition, revise any step non-linearly, and branch alternative versions. By exposing intermediate outputs and embedding pedagogical dialogue, SakugaFlow turns a black-box generator into a scaffolded learning environment that supports both creative exploration and skills acquisition.