Thunder-KoNUBench: A Corpus-Aligned Benchmark for Korean Negation Understanding

2026-01-08
arXiv.org
Citations: 0
Influential: 0
AI Summary
This study addresses two linked problems: current large language models (LLMs) handle Korean negation poorly, and few evaluation benchmarks reflect how negation is actually distributed in real-world Korean text. To bridge this gap, the authors construct Thunder-KoNUBench, the first sentence-level benchmark for Korean negation comprehension, grounded in large-scale corpus analysis. They evaluate 47 LLMs on this benchmark and analyze how model scale and instruction tuning affect negation understanding. They further show that fine-tuning on Thunder-KoNUBench improves both the models' handling of Korean negation and their broader contextual comprehension. The benchmark thus serves as a resource for research on Korean language understanding.

Abstract
Although negation is known to challenge large language models (LLMs), benchmarks for evaluating negation understanding, especially in Korean, are scarce. We conduct a corpus-based analysis of Korean negation and show that LLM performance degrades under negation. We then introduce Thunder-KoNUBench, a sentence-level benchmark that reflects the empirical distribution of Korean negation phenomena. Evaluating 47 LLMs, we analyze the effects of model size and instruction tuning, and show that fine-tuning on Thunder-KoNUBench improves negation understanding and broader contextual comprehension in Korean.
Problem

Research questions and friction points this paper is trying to address.

negation understanding
Korean language
large language models
benchmark
corpus-aligned
Innovation

Methods, ideas, or system contributions that make the work stand out.

Korean negation
corpus-aligned benchmark
large language models
fine-tuning
negation understanding
Sungmok Jung
Graduate School of Data Science, Seoul National University
Yeonkyoung So
Graduate School of Data Science, Seoul National University
Joonhak Lee
Graduate School of Data Science, Seoul National University
Sangho Kim
Associate Professor of Biomedical Engineering, National University of Singapore
Blood Rheology, Microcirculation, Hemodynamics, Gas Transport
Yelim Ahn
Graduate School of Data Science, Seoul National University
Jaejin Lee
Dept. of Computer Science and Engineering, Seoul National University
Parallel processing, Compilers, Computer architectures, Operating systems, Heterogeneous computing