Entropy-Aware Structural Alignment for Zero-Shot Handwritten Chinese Character Recognition

📅 2026-02-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the insufficient visual-semantic alignment in existing zero-shot handwritten Chinese character recognition methods, which often overlook the hierarchical structure of characters and the varying information density of their components. To this end, we propose a structure-aware alignment framework grounded in information entropy modeling. Specifically, we introduce entropy-aware positional encoding to dynamically modulate positional embeddings, construct a dual-granularity radical tree to explicitly capture the hierarchical composition of Chinese characters, and design a Top-K semantic neighbor centroid fusion mechanism to enhance semantic discriminability. Integrated within the CLIP architecture, our approach effectively mitigates visual ambiguity and achieves state-of-the-art performance under both zero-shot and few-shot settings, significantly outperforming existing baselines while demonstrating superior data efficiency and generalization capability.
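The entropy-aware positional encoding can be illustrated with a minimal sketch. It assumes the prior is the self-information of each radical, estimated from how often the radical occurs across a character set, and that it scales the positional embedding multiplicatively. The function names, the toy decompositions, and the normalization by the maximum value are illustrative assumptions, not the paper's formulation.

```python
import math
from collections import Counter


def radical_information(decompositions):
    """Estimate self-information -log2 p(r) of each radical from raw counts."""
    counts = Counter(r for radicals in decompositions.values() for r in radicals)
    total = sum(counts.values())
    return {r: -math.log2(c / total) for r, c in counts.items()}


def modulate_positions(radicals, pos_embeddings, info):
    """Scale each positional embedding by its radical's normalized information.

    Dividing by the maximum is an assumed normalization that keeps weights in (0, 1].
    """
    max_info = max(info.values())
    return [
        [(info[r] / max_info) * x for x in pos]
        for r, pos in zip(radicals, pos_embeddings)
    ]


# Toy decompositions (hypothetical data): the radical 女 is shared by all
# three characters, so it carries less information than 子, 马, or 也.
decompositions = {
    "好": ["女", "子"],
    "妈": ["女", "马"],
    "她": ["女", "也"],
}
info = radical_information(decompositions)

# Two positions with identical dummy embeddings, so only the weights differ.
pos = [[1.0, 1.0], [1.0, 1.0]]
scaled = modulate_positions(decompositions["妈"], pos, info)
```

In this toy setup the ubiquitous radical 女 receives a smaller weight than the rarer 马, which mirrors the stated goal of prioritizing discriminative components over common ones.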

📝 Abstract
Zero-shot Handwritten Chinese Character Recognition (HCCR) aims to recognize unseen characters by leveraging radical-based semantic compositions. However, existing approaches often treat characters as flat radical sequences, neglecting the hierarchical topology and the uneven information density of different components. To address these limitations, we propose an Entropy-Aware Structural Alignment Network that bridges the visual-semantic gap through information-theoretic modeling. First, we introduce an Information Entropy Prior to dynamically modulate positional embeddings via multiplicative interaction, acting as a saliency detector that prioritizes discriminative roots over ubiquitous components. Second, we construct a Dual-View Radical Tree to extract multi-granularity structural features, which are integrated via an adaptive Sigmoid-based gating network to encode both global layout and local spatial roles. Finally, a Top-K Semantic Feature Fusion mechanism is devised to augment the decoding process by utilizing the centroid of semantic neighbors, effectively rectifying visual ambiguities through feature-level consensus. Extensive experiments demonstrate that our method establishes new state-of-the-art performance, achieving an accuracy of 55.04% on the ICDAR 2013 dataset (m = 1500), significantly outperforming existing CLIP-based baselines in the challenging zero-shot setting. Furthermore, the framework is highly data-efficient, reaching 92.41% accuracy with only one support sample per class.
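The Top-K Semantic Feature Fusion step described above can be sketched as follows. This is a minimal illustration, assuming cosine similarity for neighbor retrieval and a fixed mixing weight `alpha`. The helper names, `alpha`, and the toy embeddings are hypothetical and are not taken from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def topk_centroid_fusion(visual, class_embeddings, k=2, alpha=0.5):
    """Blend a visual feature with the centroid of its k nearest class embeddings.

    alpha is an assumed mixing weight, not a value reported in the paper.
    """
    ranked = sorted(
        class_embeddings,
        key=lambda c: cosine(visual, class_embeddings[c]),
        reverse=True,
    )
    neighbors = ranked[:k]
    centroid = [
        sum(class_embeddings[c][i] for c in neighbors) / len(neighbors)
        for i in range(len(visual))
    ]
    fused = [(1 - alpha) * v + alpha * m for v, m in zip(visual, centroid)]
    return fused, neighbors


def classify(feature, class_embeddings):
    """Return the class whose embedding is most similar to the feature."""
    return max(class_embeddings, key=lambda c: cosine(feature, class_embeddings[c]))


# Toy semantic embeddings for three hypothetical character classes.
class_embeddings = {
    "A": [1.0, 0.0],
    "B": [0.8, 0.6],
    "C": [0.0, 1.0],
}
visual = [0.9, 0.2]

fused, neighbors = topk_centroid_fusion(visual, class_embeddings, k=2)
prediction = classify(fused, class_embeddings)
```

Averaging over several neighbors rather than trusting the single nearest one is what gives the "feature-level consensus" effect: one noisy match has less influence on the fused feature.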
Problem

Research questions and friction points this paper is trying to address.

Zero-Shot Handwritten Chinese Character Recognition
Structural Alignment
Information Entropy
Radical Composition
Visual-Semantic Gap
Innovation

Methods, ideas, or system contributions that make the work stand out.

Entropy-Aware
Structural Alignment
Zero-Shot HCCR
Radical Tree
Semantic Feature Fusion
Authors
Qiuming Luo
College of Computer Science and Software Engineering, Shenzhen University, Nanshan, Shenzhen, 518060, Guangdong, China; Shenzhen Key Laboratory of Embedded System Design, Nanshan, Shenzhen, 518060, Guangdong, China
Tao Zeng
College of Computer Science and Software Engineering, Shenzhen University, Nanshan, Shenzhen, 518060, Guangdong, China
Feng Li
Undergraduate School of Artificial Intelligence, Shenzhen Polytechnic University, Nanshan, Shenzhen, 518055, Shenzhen, Guangdong, China
Heming Liu
Undergraduate School of Artificial Intelligence, Shenzhen Polytechnic University, Nanshan, Shenzhen, 518055, Shenzhen, Guangdong, China
Rui Mao
Nanyang Technological University
Computational Linguistics, Cognitive Computing, Metaphor, Quantitative Finance, Neurosymbolic AI
Chang Kong
Shenzhen Polytechnic University
Computer Vision, Artificial Intelligence, Machine Learning, High Performance Computing