Making Sense of Korean Sentences: A Comprehensive Evaluation of LLMs through KoSEnd Dataset

📅 2025-07-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates large language models' (LLMs) ability to assess the naturalness of complex sentence-final forms in low-resource agglutinative languages—specifically Korean—where morphological richness and context sensitivity pose significant challenges for LLMs. Method: We introduce KoSEnd, the first Korean dataset explicitly designed for evaluating the naturalness of sentence-final forms, comprising 3,000 sentences drawn from diverse, authentic contexts, each annotated for the naturalness of 15 sentence-ending forms. We systematically evaluate 11 LLMs, analyzing results by parameter count and prediction consistency. Contribution/Results: Experiments reveal pervasive limitations in LLMs' ability to discriminate naturalness. To address this, we propose an explicit prompting strategy that informs models that a sentence ending may be missing, incorporating morphological knowledge into inference. This linguistically grounded approach yields substantial performance gains, demonstrating that explicit modeling of agglutinative morphology is critical for effective LLM adaptation to low-resource languages. Our work establishes a benchmark, methodology, and empirical foundation for evaluating and improving LLMs on agglutinative languages.

📝 Abstract
Although LLMs have made significant progress in various languages, there are still concerns about their effectiveness with low-resource agglutinative languages compared to languages such as English. In this study, we focused on Korean, a language known for its complex sentence endings, and evaluated LLMs on this challenging aspect. We introduce the Korean Sentence Endings (KoSEnd) dataset, which includes 3,000 sentences, each annotated for the naturalness of 15 sentence ending forms. These were collected from diverse sources to cover a range of contexts. We evaluated 11 LLMs to assess their understanding of Korean sentence endings, analyzing them based on parameter count and prediction consistency. Notably, we found that informing models about the possibility of missing sentence endings improved performance, highlighting the impact of explicitly considering certain linguistic features.
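The abstract's key finding is that simply telling the model a sentence ending might be absent improves its naturalness judgments. As a rough illustration of what such a prompt could look like, here is a hypothetical sketch (the paper's exact prompt wording is not given here; the function name, wording, and structure below are assumptions):

```python
def build_naturalness_prompt(sentence: str, ending: str, hint_missing: bool = True) -> str:
    """Build a prompt asking an LLM to judge one sentence/ending pair.

    When hint_missing is True, the prompt explicitly states that the
    ending slot may legitimately be empty, mirroring the paper's finding
    that surfacing this possibility improves model performance.
    """
    lines = [
        "You are a judge of Korean sentence naturalness.",
        f"Sentence stem: {sentence}",
        f"Candidate sentence ending: {ending if ending else '(none)'}",
    ]
    if hint_missing:
        lines.append(
            "Note: some sentences are natural with no ending attached, "
            "so an empty ending slot can be a valid answer."
        )
    lines.append("Answer with 'natural' or 'unnatural' only.")
    return "\n".join(lines)
```

The same pair can then be scored with and without the hint to compare the two prompting conditions, as the paper does across its 11 models.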
Problem

Research questions and friction points this paper is trying to address.

Evaluating LLMs on the complexity of Korean sentence endings
Assessing LLM performance on low-resource agglutinative languages
Improving model accuracy through explicit awareness of linguistic features
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introducing the KoSEnd dataset for evaluating Korean sentence endings
Evaluating 11 LLMs on sentence-ending naturalness
Improving performance by informing models that endings may be missing
Seunguk Yu
Chung-Ang University, Seoul, Republic of Korea
Kyeonghyun Kim
Chung-Ang University, Seoul, Republic of Korea
Jungmin Yun
Chung-Ang University, Seoul, Republic of Korea
Youngbin Kim
Senior Researcher, ETRI (Electronics and Telecommunications Research Institute)