FontCrafter: High-Fidelity Element-Driven Artistic Font Creation with Visual In-Context Generation

📅 2026-03-23
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work proposes a visual context-driven framework for artistic font generation that overcomes limitations in style diversity and fine-grained control inherent in existing methods. By treating element images as visual contexts, the approach leverages an image inpainting model to transfer styles at the pixel level into glyph regions. A lightweight Context-aware Mask Adapter is introduced to inject shape information, enhancing structural control. Notably, the method is the first to distinguish between object-like and amorphous elements and incorporates a training-free attention redirection mechanism for region-aware style modulation. Additionally, edge redrawing is employed to improve boundary naturalness. Evaluated under a zero-shot setting, the framework achieves superior fidelity in both structure and texture, enables flexible style mixing, and demonstrates state-of-the-art performance on the ElementFont dataset.
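The edge redrawing step mentioned above repaints a thin band straddling the glyph boundary. As a rough illustration only (not the paper's implementation), such a band can be derived from the binary glyph mask with morphological dilation and erosion; the function names and the one-pixel band width are illustrative choices:

```python
import numpy as np

def dilate(mask, iterations=1):
    """Binary dilation with a 4-connected cross structuring element."""
    m = mask.copy()
    for _ in range(iterations):
        p = np.pad(m, 1)  # pad with False so the border stays background
        m = (p[:-2, 1:-1] | p[2:, 1:-1] |
             p[1:-1, :-2] | p[1:-1, 2:] | p[1:-1, 1:-1])
    return m

def erode(mask, iterations=1):
    """Binary erosion as the complement of dilating the complement."""
    return ~dilate(~mask, iterations)

def edge_band(glyph_mask, width=1):
    """Band around the glyph boundary: the region an edge-repainting
    pass would regenerate to make the stroke boundary look natural."""
    return dilate(glyph_mask, width) & ~erode(glyph_mask, width)
```

The band covers pixels just inside and just outside the stroke contour, so repainting it blends the element texture into the background without disturbing the glyph interior.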

๐Ÿ“ Abstract
Artistic font generation aims to synthesize stylized glyphs based on a reference style. However, existing approaches suffer from limited style diversity and coarse control. In this work, we explore the potential of element-driven artistic font generation. Elements are the fundamental visual units of a font, serving as reference images for the desired style. Conceptually, we categorize elements into object elements (e.g., flowers or stones) with distinct structures and amorphous elements (e.g., flames or clouds) with unstructured textures. We introduce FontCrafter, an element-driven framework for font creation, and construct a large-scale dataset, ElementFont, which contains diverse element types and high-quality glyph images. However, achieving high-fidelity reconstruction of both texture and structure of reference elements remains challenging. To address this, we propose an in-context generation strategy that treats element images as visual context and uses an inpainting model to transfer element styles into glyph regions at the pixel level. To further control glyph shapes, we design a lightweight Context-aware Mask Adapter (CMA) that injects shape information. Moreover, a training-free attention redirection mechanism enables region-aware style control and suppresses stroke hallucination. In addition, edge repainting is applied to make boundaries more natural. Extensive experiments demonstrate that FontCrafter achieves strong zero-shot generation performance, particularly in preserving structural and textural fidelity, while also supporting flexible controls such as style mixture.
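The in-context inpainting strategy can be pictured as composing a canvas in which the element image is the visible context and the glyph region is the hole an inpainting model must fill. The toy sketch below is NumPy-only and uses random pixel sampling as a crude stand-in for the actual inpainting model; all function names and the side-by-side layout are illustrative assumptions, not the paper's code:

```python
import numpy as np

def compose_incontext_canvas(element, glyph_mask):
    """Lay the element image and the masked glyph region side by side:
    the element half is visible context, the glyph half is left for the
    inpainting model to fill.

    element:    (H, W, 3) uint8 style reference image
    glyph_mask: (H, W) bool, True inside glyph strokes
    """
    h, w, _ = element.shape
    canvas = np.zeros((h, 2 * w, 3), dtype=np.uint8)
    canvas[:, :w] = element                      # visible context half
    inpaint_mask = np.zeros((h, 2 * w), dtype=bool)
    inpaint_mask[:, w:] = glyph_mask             # hole to be inpainted
    return canvas, inpaint_mask

def toy_pixel_transfer(element, glyph_mask, rng=None):
    """Stand-in for the inpainting model: fill glyph pixels by sampling
    element pixels, i.e. a crude pixel-level style transfer."""
    if rng is None:
        rng = np.random.default_rng(0)
    h, w, _ = element.shape
    out = np.full((h, w, 3), 255, dtype=np.uint8)  # white background
    n = int(glyph_mask.sum())
    ys = rng.integers(0, h, size=n)
    xs = rng.integers(0, w, size=n)
    out[glyph_mask] = element[ys, xs]              # copy element pixels
    return out
```

A real system would pass the canvas and mask to a diffusion inpainting model; the point here is only the data layout: style flows from the visible context into the masked glyph region.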
Problem

Research questions and friction points this paper is trying to address.

artistic font generation
style diversity
high-fidelity reconstruction
element-driven
visual context
Innovation

Methods, ideas, or system contributions that make the work stand out.

element-driven font generation
visual in-context learning
inpainting-based style transfer
context-aware mask adapter
attention redirection
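The attention redirection idea, restricting queries in the glyph region so they draw style only from element-context tokens, can be sketched as masked attention. This is a conceptual toy under assumed token partitions, not the paper's mechanism verbatim:

```python
import numpy as np

def redirected_attention(Q, K, V, query_in_glyph, key_in_context):
    """Masked attention: glyph-region query tokens may only attend to
    element-context key tokens, steering style from the context into the
    glyph and cutting attention paths that could hallucinate strokes.

    Q: (Tq, d) queries; K, V: (Tk, d) keys and values
    query_in_glyph: (Tq,) bool, True for tokens inside the glyph region
    key_in_context: (Tk,) bool, True for element-context tokens
    """
    d = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d)                       # (Tq, Tk)
    # Block glyph queries from attending outside the element context.
    blocked = query_in_glyph[:, None] & ~key_in_context[None, :]
    logits = np.where(blocked, -1e9, logits)
    logits -= logits.max(axis=-1, keepdims=True)        # stable softmax
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights
```

Because the redirection only edits attention logits, it is training-free: the same pretrained model runs unchanged, with the mask applied at inference time.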
Wuyang Luo
Dalian University of Technology
Chengkai Tan
Dalian University of Technology
Chang Ge
University of Minnesota
Databases · Data Cleaning · Data Privacy · Data Security
Binye Hong
Dalian University of Technology
Su Yang
Fudan University
Social Computing · Urban Mobility · Pattern Recognition
Yongjiu Ma
Dalian University of Technology