ChildlikeSHAPES: Semantic Hierarchical Region Parsing for Animating Figure Drawings

📅 2025-04-10
🤖 AI Summary
Semantic region parsing of childlike figure drawings is challenging: segmentation models built for realistic humans often fail on such abstract figures, and animation generation frequently distorts a drawing's style. Method: We propose a novel hierarchical segmentation model tailored to childlike drawings, built on the SAM architecture and its pre-trained weights and fine-tuned to produce semantic region labels. We also introduce a dataset of 16,000 childlike drawings with pixel-level annotations across 25 semantic categories. Contribution/Results: Our model achieves higher accuracy than state-of-the-art segmentation models for realistic humans and cartoon figures, even after fine-tuning. It supports a fully automatic facial animation pipeline, a figure relighting pipeline, and improvements to an existing childlike-drawing animation method, all while preserving each drawing's style. It also generalizes to out-of-domain figures.

📝 Abstract
Childlike human figure drawings represent one of humanity's most accessible forms of character expression, yet automatically analyzing their contents remains a significant challenge. While semantic segmentation of realistic humans has recently advanced considerably, existing models often fail when confronted with the abstract, representational nature of childlike drawings. This semantic understanding is a crucial prerequisite for animation tools that seek to modify figures while preserving their unique style. To help achieve this, we propose a novel hierarchical segmentation model, built upon the SAM architecture and its pre-trained weights, to quickly and accurately obtain these semantic labels. Our model achieves higher accuracy than state-of-the-art segmentation models focused on realistic humans and cartoon figures, even after fine-tuning. We demonstrate the value of our model for semantic segmentation through multiple applications: a fully automatic facial animation pipeline, a figure relighting pipeline, improvements to an existing childlike human figure drawing animation method, and generalization to out-of-domain figures. Finally, to support future work in this area, we introduce a dataset of 16,000 childlike drawings with pixel-level annotations across 25 semantic categories. Our work can enable entirely new, easily accessible tools for hand-drawn character animation, and our dataset can enable new lines of inquiry in a variety of graphics and human-centric research fields.

Problem

Research questions and friction points this paper is trying to address.

Automatically analyzing abstract childlike human figure drawings
Improving semantic segmentation for animation tools preserving style
Creating a dataset for future research in graphics and animation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical segmentation model based on SAM
Higher accuracy than state-of-the-art segmentation models for realistic humans and cartoon figures, even after fine-tuning
Dataset of 16,000 annotated childlike drawings
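
This page does not say which metric the accuracy comparison uses. As a rough illustration only, the sketch below shows one common way to score a predicted label map against a pixel-level annotation: per-class intersection-over-union (IoU), averaged over the classes present. The function names, the integer class ids 0 to 24, and the choice of IoU are assumptions made for this example, not details taken from the paper.

```python
from typing import Dict, Sequence

# 25 semantic categories comes from the dataset description;
# numbering them 0..24 is an assumption made for this sketch.
NUM_CLASSES = 25

LabelMap = Sequence[Sequence[int]]  # rows of integer class ids


def per_class_iou(pred: LabelMap, target: LabelMap,
                  num_classes: int = NUM_CLASSES) -> Dict[int, float]:
    """Return IoU for every class that appears in pred or target."""
    if len(pred) != len(target):
        raise ValueError("label maps must have the same number of rows")
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        if len(pred_row) != len(target_row):
            raise ValueError("label map rows must have the same length")
        for p, t in zip(pred_row, target_row):
            if p == t:
                # The pixel belongs to both masks of class p.
                intersection[p] += 1
                union[p] += 1
            else:
                # The pixel belongs to the predicted mask of p
                # and to the annotated mask of t.
                union[p] += 1
                union[t] += 1
    return {c: intersection[c] / union[c]
            for c in range(num_classes) if union[c] > 0}


def mean_iou(pred: LabelMap, target: LabelMap,
             num_classes: int = NUM_CLASSES) -> float:
    """Average IoU over the classes present in either label map."""
    ious = per_class_iou(pred, target, num_classes)
    return sum(ious.values()) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy 2x2 example with three classes in use.
    prediction = [[0, 1], [1, 2]]
    annotation = [[0, 1], [2, 2]]
    print(per_class_iou(prediction, annotation))
    print(mean_iou(prediction, annotation))
```

Averaging only over classes that occur in either map keeps absent categories from contributing an undefined 0/0 term, which matters when a drawing contains only a few of the 25 categories.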