The Impact of Visual Information in Chinese Characters: Evaluating Large Models' Ability to Recognize and Utilize Radicals

📅 2024-10-11
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates whether large language models (LLMs) and vision-language models (VLMs) can perceive and use the visual substructure of Chinese characters (radicals, composition structures, strokes, and stroke counts) to improve Chinese language understanding.
Method: The study builds a benchmark for evaluating models' understanding of these visual elements, evaluates models through prompting both with and without images of the characters, and then adds radical information to prompts for Chinese language processing (CLP) tasks.
Contribution/Results: Mainstream LLMs and VLMs show some, but still limited, knowledge of the visual information, whether or not character images are provided. Supplying radical information yields consistent improvement in part-of-speech tagging, which suggests that sub-character information could help CLP more broadly.

📝 Abstract
The glyphic writing system of Chinese incorporates information-rich visual features in each character, such as radicals that provide hints about meaning or pronunciation. However, there has been no investigation into whether contemporary Large Language Models (LLMs) and Vision-Language Models (VLMs) can harness these sub-character features in Chinese through prompting. In this study, we establish a benchmark to evaluate LLMs' and VLMs' understanding of visual elements in Chinese characters, including radicals, composition structures, strokes, and stroke counts. Our results reveal that models surprisingly exhibit some, but still limited, knowledge of the visual information, regardless of whether images of characters are provided. To incite models' ability to use radicals, we further experiment with incorporating radicals into the prompts for Chinese language processing (CLP) tasks. We observe consistent improvement in Part-Of-Speech tagging when providing additional information about radicals, suggesting the potential to enhance CLP by integrating sub-character information.

Problem

Research questions and friction points this paper is trying to address.

Evaluating large models' recognition of Chinese character radicals
Assessing models' ability to utilize visual sub-character features
Investigating radical integration for Chinese language processing improvement

Innovation

Methods, ideas, or system contributions that make the work stand out.

Benchmark evaluates visual element understanding in Chinese characters
Incorporating radicals into prompts improves Part-Of-Speech tagging
Results suggest that integrating sub-character information could enhance Chinese language processing
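
The idea of adding radical information to a prompt can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `RADICALS` lookup is a tiny hand-written table, and the function names and prompt wording are hypothetical, not the format used in the paper.

```python
# Hypothetical, hand-written lookup of character -> radical.
# A real setup would draw on a full radical dictionary instead.
RADICALS = {
    "河": "氵",  # river: water radical
    "吃": "口",  # eat: mouth radical
    "树": "木",  # tree: wood radical
    "跑": "足",  # run: foot radical
}


def annotate_with_radicals(sentence: str, radicals: dict = RADICALS) -> str:
    """Append the radical in parentheses after each character found in the lookup."""
    parts = []
    for ch in sentence:
        radical = radicals.get(ch)
        parts.append(f"{ch}({radical})" if radical else ch)
    return "".join(parts)


def build_pos_prompt(sentence: str) -> str:
    """Build a part-of-speech tagging prompt that carries radical hints.

    The wording below is an illustrative assumption, not the paper's prompt.
    """
    annotated = annotate_with_radicals(sentence)
    return (
        "Tag each word in the following Chinese sentence with its part of speech.\n"
        f"Sentence: {sentence}\n"
        f"Radical hints: {annotated}\n"
        "Answer:"
    )


if __name__ == "__main__":
    print(build_pos_prompt("他跑"))
```

Characters missing from the lookup pass through unchanged, so the prompt stays valid even with partial radical coverage.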
Xiaofeng Wu
Georgia Institute of Technology
Karl Stratos
Apple AI/ML
Natural Language Processing, Deep Learning
Wei Xu
Georgia Institute of Technology