Moyun: A Diffusion-Based Model for Style-Specific Chinese Calligraphy Generation

📅 2024-10-10
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Generating high-fidelity, style-consistent Chinese calligraphy for unseen characters remains challenging due to the scarcity of training samples for rare or unobserved glyphs. Method: We propose a style-controllable Chinese calligraphy generation framework that replaces the conventional U-Net backbone in diffusion models with Vision Mamba and introduces a novel TripleLabel conditioning mechanism—jointly encoding calligrapher identity, font category, and character-level stylistic attributes—to achieve fine-grained disentanglement and coordinated control. Contribution/Results: Trained and evaluated on Mobao, our large-scale, self-collected calligraphy dataset (1.9M+ images), the model accurately reproduces target calligraphers’ stroke morphology, structural composition, and artistic spirit—even for unseen characters. Quantitative and qualitative experiments demonstrate significant improvements in style fidelity and cross-glyph generalization, establishing a new paradigm for controllable calligraphic generation and style transfer.
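The TripleLabel conditioning described above can be pictured as three label embeddings (calligrapher, font, character style) combined into a single conditioning vector for the denoiser. A minimal illustrative sketch follows; all names, dimensions, and the additive-combination choice are assumptions for illustration, not the paper's actual implementation:

```python
import numpy as np

# Illustrative embedding tables (learned parameters in a real model).
# Dimensions and label counts are hypothetical.
rng = np.random.default_rng(0)
EMB_DIM = 64
N_CALLIGRAPHERS, N_FONTS, N_STYLES = 20, 5, 10

calligrapher_emb = rng.standard_normal((N_CALLIGRAPHERS, EMB_DIM))
font_emb = rng.standard_normal((N_FONTS, EMB_DIM))
style_emb = rng.standard_normal((N_STYLES, EMB_DIM))

def triple_label_condition(calligrapher_id: int, font_id: int, style_id: int) -> np.ndarray:
    """Combine the three label embeddings into one conditioning vector
    that a diffusion denoiser could consume alongside the noisy input."""
    return (calligrapher_emb[calligrapher_id]
            + font_emb[font_id]
            + style_emb[style_id])

cond = triple_label_condition(calligrapher_id=3, font_id=1, style_id=7)
print(cond.shape)  # (64,)
```

In practice the combined vector would be injected into the backbone (here Vision Mamba rather than a U-Net) at each denoising step, e.g. added to the timestep embedding; summation is only one of several plausible fusion choices.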

📝 Abstract
Although Chinese calligraphy generation has achieved style transfer, generating calligraphy with a specified calligrapher, font, and character style remains challenging. To address this, we propose a new Chinese calligraphy generation model, 'Moyun', which replaces the U-Net in the diffusion model with Vision Mamba and introduces the TripleLabel control mechanism to achieve controllable calligraphy generation. The model was tested on 'Mobao', our large-scale dataset of over 1.9 million images, and the results demonstrate that 'Moyun' can effectively control the generation process and produce calligraphy in the specified style. Even for characters the calligrapher never wrote, 'Moyun' can generate calligraphy that matches that calligrapher's style.
Problem

Research questions and friction points this paper is trying to address.

Controllably generating Chinese calligraphy in a specified calligrapher, font, and character style remains difficult
Style transfer alone cannot reproduce a calligrapher's style for characters they never wrote
Scarce training samples for rare or unseen glyphs limit cross-glyph generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Replaces the U-Net backbone in the diffusion model with Vision Mamba
Introduces the TripleLabel control mechanism to jointly condition on calligrapher, font, and character style
Trains on the large-scale, self-collected Mobao dataset of over 1.9 million images