🤖 AI Summary
This study addresses the lack of evaluation benchmarks for multimodal generative AI in educational contexts for less-explored languages, specifically Korean. To this end, we introduce KoNET, the first comprehensive Korean national education examination benchmark, spanning elementary school through university. KoNET is constructed systematically from Korea's national curriculum standards and integrates four categories of standardized exam items. We propose a structured parsing pipeline, a cross-level knowledge coverage assessment, and a unified evaluation framework that supports open-source, open-access, and closed-API models and compares model results against human error rates. Key contributions include: (1) the first multimodal, multi-stage, multidisciplinary, high-difficulty Korean education benchmark aligned with East Asian pedagogical systems; (2) empirical identification of capability gaps and subject-specific biases in state-of-the-art models on Korean educational reasoning; and (3) planned open-sourcing of the code and dataset builder. Together, these help fill a gap in non-English educational AI evaluation and support AI research on education in less-explored languages.
📝 Abstract
This paper presents the Korean National Educational Test Benchmark (KoNET), a new benchmark designed to evaluate multimodal generative AI systems using Korean national educational tests. KoNET comprises four exams: the Korean Elementary, Middle, and High School General Educational Development Tests (KoEGED, KoMGED, and KoHGED) and the Korean College Scholastic Ability Test (KoCSAT). These exams are known for their rigorous standards and diverse questions, which enables a comprehensive analysis of AI performance across educational levels. By focusing on Korean, KoNET provides insights into model performance in less-explored languages. We assess a range of models (open-source, open-access, and closed APIs) by examining question difficulty, subject diversity, and human error rates. The code and dataset builder will be fully open-sourced at https://github.com/naver-ai/KoNET.