AI Summary
This study addresses the poor performance of current large language models (LLMs) on Korean negation understanding, primarily attributed to the absence of evaluation benchmarks that reflect real-world linguistic distributions. To bridge this gap, the authors construct Thunder-KoNUBench, the first sentence-level benchmark for Korean negation comprehension, grounded in large-scale corpus analysis. They systematically evaluate 47 LLMs on this benchmark, revealing the nuanced effects of model scale and instruction tuning on negation understanding. Furthermore, they demonstrate that targeted fine-tuning on Thunder-KoNUBench not only substantially improves models' ability to handle Korean negation but also enhances their general contextual comprehension. This benchmark thus provides a critical resource and a new direction for advancing Korean language understanding research.
Abstract
Although negation is known to challenge large language models (LLMs), benchmarks for evaluating negation understanding, especially in Korean, are scarce. We conduct a corpus-based analysis of Korean negation and show that LLM performance degrades under negation. We then introduce Thunder-KoNUBench, a sentence-level benchmark that reflects the empirical distribution of Korean negation phenomena. Evaluating 47 LLMs, we analyze the effects of model size and instruction tuning, and show that fine-tuning on Thunder-KoNUBench improves negation understanding and broader contextual comprehension in Korean.