Calligrapher: Freestyle Text Image Customization

📅 2025-06-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address imprecise font-style control and heavy reliance on annotated data in digital calligraphy and design, this paper proposes Calligrapher, a diffusion-based framework for free-form text image customization. The method has three parts. First, a self-distillation mechanism uses the pre-trained text-to-image model together with a large language model (LLM) to automatically construct a style-centric typography benchmark. Second, a trainable style encoder, built from a Qformer and linear layers, extracts style features from reference images for localized style injection. Third, an in-context generation mechanism embeds the reference image directly into the denoising process to tighten alignment with the target style. Quantitative and qualitative evaluations across diverse fonts and design contexts indicate accurate reproduction of stylistic detail and precise glyph positioning, and the reported generation quality and visual consistency surpass those of existing methods.

📝 Abstract
We introduce Calligrapher, a novel diffusion-based framework that integrates advanced text customization with artistic typography for digital calligraphy and design applications. Addressing the challenges of precise style control and data dependency in typographic customization, our framework incorporates three key technical contributions. First, we develop a self-distillation mechanism that leverages the pre-trained text-to-image generative model itself alongside the large language model to automatically construct a style-centric typography benchmark. Second, we introduce a localized style injection framework via a trainable style encoder, which comprises both Qformer and linear layers, to extract robust style features from reference images. Third, an in-context generation mechanism is employed to directly embed reference images into the denoising process, further refining alignment with the target styles. Extensive quantitative and qualitative evaluations across diverse fonts and design contexts confirm Calligrapher's accurate reproduction of intricate stylistic details and precise glyph positioning. By automating high-quality, visually consistent typography, Calligrapher surpasses traditional models, empowering creative practitioners in digital art, branding, and contextual typographic design.

Problem

Research questions and friction points this paper is trying to address.

Precise style control in typographic customization
Reducing data dependency for artistic typography
Automating high-quality, consistent digital calligraphy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-distillation mechanism for automatic typography benchmark
Localized style injection via trainable style encoder
In-context generation for refined style alignment
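
The abstract describes the style encoder only as a Qformer plus linear layers, so the following is a minimal sketch of that general idea, not the authors' implementation. It assumes a simplified single-head cross-attention with no learned key/value projections, and uses random numbers in place of image-encoder features and trained parameters. All names (`cross_attend`, `encode_style`, `random_matrix`, and the dimensions) are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attend(queries, patch_features):
    """Each query attends over all patch features (single head, simplified)."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    tokens = []
    for q in queries:
        weights = softmax([dot(q, p) * scale for p in patch_features])
        # Weighted average of patch features: one output token per query.
        token = [sum(w * p[i] for w, p in zip(weights, patch_features))
                 for i in range(len(q))]
        tokens.append(token)
    return tokens


def linear(tokens, weight, bias):
    """Project each token with y = W x + b; weight has shape (out_dim, in_dim)."""
    return [[dot(row, t) + b for row, b in zip(weight, bias)] for t in tokens]


def encode_style(patch_features, queries, weight, bias):
    """Hypothetical style encoder: query-based pooling, then a linear projection."""
    return linear(cross_attend(queries, patch_features), weight, bias)


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
feat_dim, out_dim, num_queries, num_patches = 8, 4, 3, 16
patches = random_matrix(num_patches, feat_dim, rng)  # stand-in for reference-image features
queries = random_matrix(num_queries, feat_dim, rng)  # would be learned during training
weight = random_matrix(out_dim, feat_dim, rng)       # would be learned during training
bias = [0.0] * out_dim

style_tokens = encode_style(patches, queries, weight, bias)
print(len(style_tokens), len(style_tokens[0]))  # 3 4
```

The point of the query-based design is that a reference image of any size is compressed into a fixed number of style tokens, which a diffusion model could then consume as conditioning. How Calligrapher actually injects those features is not specified in this summary.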