Enabling Stroke-Level Structural Analysis of Hieroglyphic Scripts without Language-Specific Priors

📅 2026-01-09
🏛️ arXiv.org
🤖 AI Summary
Existing large language models and multimodal architectures struggle to model the stroke-level structure of hieroglyphic scripts, while conventional approaches rely heavily on linguistic priors and incur high manual annotation costs. This work proposes HieroSA, a framework that automatically extracts stroke-level structural representations directly from character bitmaps without requiring linguistic priors or handcrafted annotations. By leveraging image processing and geometric normalization techniques, HieroSA generates interpretable sequences of line segments in a normalized coordinate space. The method is not tied to any single script, which makes it applicable to both ancient and modern hieroglyphic systems. Experimental results indicate that it captures intrinsic character structures and their semantic properties, offering a new computational tool for graphematic analysis and supporting cross-lingual transfer applications.

📝 Abstract
Hieroglyphs, as logographic writing systems, encode rich semantic and cultural information within their internal structural composition. Yet current advanced Large Language Models (LLMs) and Multimodal LLMs (MLLMs) usually remain structurally blind to this information. LLMs process characters as textual tokens, while MLLMs additionally view them as raw pixel grids. Both fall short of modeling the underlying logic of character strokes. Furthermore, existing structural analysis methods are often script-specific and labor-intensive. In this paper, we propose Hieroglyphic Stroke Analyzer (HieroSA), a novel and generalizable framework that enables MLLMs to automatically derive stroke-level structures from character bitmaps without handcrafted data. It transforms character images from modern logographic and ancient hieroglyphic scripts into explicit, interpretable line-segment representations in a normalized coordinate space, allowing for cross-lingual generalization. Extensive experiments demonstrate that HieroSA effectively captures character-internal structures and semantics, bypassing the need for language-specific priors. Experimental results highlight the potential of our work as a graphematics analysis tool for a deeper understanding of hieroglyphic scripts. View our code at https://github.com/THUNLP-MT/HieroSA.
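
The abstract does not spell out how a "line-segment representation in a normalized coordinate space" is encoded. As a minimal sketch only, the Python below assumes that each stroke is a straight segment with two pixel-space endpoints and that normalization means rescaling the glyph's bounding box into the unit square while preserving aspect ratio. The names `Segment` and `normalize_segments` are hypothetical and are not taken from the HieroSA code.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Segment:
    """One stroke approximated as a straight line (hypothetical type)."""
    start: Point
    end: Point


def normalize_segments(segments: List[Segment]) -> List[Segment]:
    """Rescale pixel-space segments so the glyph fits inside [0, 1] x [0, 1].

    Assumption: a single scale factor (the longer bounding-box side) is used
    for both axes, so the glyph's proportions are not distorted.
    """
    if not segments:
        return []
    xs = [p[0] for s in segments for p in (s.start, s.end)]
    ys = [p[1] for s in segments for p in (s.start, s.end)]
    min_x, min_y = min(xs), min(ys)
    # Fall back to 1.0 for a degenerate glyph whose points all coincide.
    scale = max(max(xs) - min_x, max(ys) - min_y) or 1.0

    def norm(p: Point) -> Point:
        return ((p[0] - min_x) / scale, (p[1] - min_y) / scale)

    return [Segment(norm(s.start), norm(s.end)) for s in segments]


# Hypothetical input: a cross-shaped glyph traced on a 64x64 bitmap.
strokes = [Segment((8, 32), (56, 32)), Segment((32, 8), (32, 56))]
normalized = normalize_segments(strokes)
for seg in normalized:
    print(seg)
```

Under these assumptions, the same glyph drawn at a different size or offset yields the same segment list, which is the property a shared coordinate space needs in order to compare characters across scripts.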
Problem

Research questions and friction points this paper is trying to address.

hieroglyphic scripts
stroke-level structural analysis
language-specific priors
logographic writing systems
character structure modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

stroke-level analysis
hieroglyphic scripts
multimodal LLMs
structure representation
cross-lingual generalization
Authors

Fuwen Luo
Tsinghua University
Computer Science

Zihao Wan
Dept. of Comp. Sci. & Tech., Institute for AI, Tsinghua University

Ziyue Wang
Dept. of Comp. Sci. & Tech., Institute for AI, Tsinghua University

Yaluo Liu
Rixin College, Tsinghua University

Pau Tong Lin Xu
Dept. of Comp. Sci. & Tech., Institute for AI, Tsinghua University

Xuanjia Qiao
Department of Foreign Languages and Literatures, Tsinghua University

Xiaolong Wang
Tsinghua NLP
NLP, Agent, Subjective Tasks

Peng Li
HKUST | Tsinghua University
3D generation, human reconstruction, depth estimation

Yang Liu
Tsinghua University