Moyun: A Diffusion-Based Model for Style-Specific Chinese Calligraphy Generation

📅 2024-10-10
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Generating high-fidelity, style-consistent Chinese calligraphy for unseen characters remains challenging due to the scarcity of training samples for rare or unobserved glyphs. Method: We propose a style-controllable Chinese calligraphy generation framework that replaces the conventional U-Net backbone in diffusion models with Vision Mamba and introduces a novel TripleLabel conditioning mechanism—jointly encoding calligrapher identity, font category, and character-level stylistic attributes—to achieve fine-grained disentanglement and coordinated control. Contribution/Results: Trained and evaluated on Mobao, our large-scale, self-collected calligraphy dataset (1.9M+ images), the model accurately reproduces target calligraphers’ stroke morphology, structural composition, and artistic spirit—even for unseen characters. Quantitative and qualitative experiments demonstrate significant improvements in style fidelity and cross-glyph generalization, establishing a new paradigm for controllable calligraphic generation and style transfer.
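The TripleLabel conditioning described above can be pictured as three label embeddings (calligrapher, font, character style) combined into a single conditioning vector for the denoiser. A minimal illustrative sketch follows; all names, dimensions, and the additive-combination choice are assumptions for illustration, not the paper's actual implementation:

```python
import numpy as np

# Illustrative embedding tables (learned parameters in a real model).
# Dimensions and label counts are hypothetical.
rng = np.random.default_rng(0)
EMB_DIM = 64
N_CALLIGRAPHERS, N_FONTS, N_STYLES = 20, 5, 10

calligrapher_emb = rng.standard_normal((N_CALLIGRAPHERS, EMB_DIM))
font_emb = rng.standard_normal((N_FONTS, EMB_DIM))
style_emb = rng.standard_normal((N_STYLES, EMB_DIM))

def triple_label_condition(calligrapher_id: int, font_id: int, style_id: int) -> np.ndarray:
    """Combine the three label embeddings into one conditioning vector
    that a diffusion denoiser could consume alongside the noisy input."""
    return (calligrapher_emb[calligrapher_id]
            + font_emb[font_id]
            + style_emb[style_id])

cond = triple_label_condition(calligrapher_id=3, font_id=1, style_id=7)
print(cond.shape)  # (64,)
```

In practice the combined vector would be injected into the backbone (here Vision Mamba rather than a U-Net) at each denoising step, e.g. added to the timestep embedding; summation is only one of several plausible fusion choices.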

📝 Abstract
Although Chinese calligraphy generation has achieved style transfer, generating calligraphy with a specified calligrapher, font, and character style remains challenging. To address this, we propose a new Chinese calligraphy generation model, 'Moyun', which replaces the U-Net in the diffusion model with Vision Mamba and introduces the TripleLabel control mechanism to achieve controllable calligraphy generation. The model was tested on 'Mobao', our large-scale dataset of over 1.9 million images, and the results demonstrate that 'Moyun' can effectively control the generation process and produce calligraphy in the specified style. Even for characters the calligrapher never wrote, 'Moyun' can generate calligraphy that matches that calligrapher's style.
Problem

Research questions and friction points this paper is trying to address.

Controllably generating Chinese calligraphy in a specified calligrapher, font, and character style remains difficult
Style transfer alone cannot reproduce a calligrapher's style for characters they never wrote
Scarce training samples for rare or unseen glyphs limit cross-glyph generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Replaces the U-Net backbone in the diffusion model with Vision Mamba
Introduces the TripleLabel control mechanism to jointly condition on calligrapher, font, and character style
Trains on the large-scale, self-collected Mobao dataset of over 1.9 million images