RoMath: A Mathematical Reasoning Benchmark in Romanian

📅 2024-09-17
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing mathematical reasoning benchmarks are heavily English-centric and neglect low-resource languages. This work addresses the gap in formal and informal mathematical text understanding for Romanian, a morphologically rich, low-resource language on which current models perform poorly.

Method: We introduce RoMath, the first specialized, multi-level evaluation benchmark for Romanian mathematical reasoning, comprising three data categories: national high-school graduation exams, mathematics competitions, and controllably synthesized problems. RoMath is grounded in Romania's educational curriculum and linguistic morphology rather than in naive machine translation. Data construction combines expert-authored items with rule-guided synthetic generation to ensure high-quality, linguistically accurate annotations.

Contribution/Results: Evaluation of several open-weight LLMs reveals substantial performance deficits in Romanian mathematical reasoning. All code and datasets are publicly released, establishing a reproducible, localization-aware assessment infrastructure that challenges the English-centric paradigm and advances multilingual mathematical AI.
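The summary's "rule-guided synthetic generation" can be illustrated with a minimal sketch: a fixed problem template is filled with sampled parameters, and the answer is computed by the program rather than annotated by hand. The template text, function names, and parameter ranges below are hypothetical and chosen for illustration only; they are not taken from the RoMath code or dataset.

```python
import random

# Hypothetical Romanian template ("Compute the sum of the numbers {a} and {b}.").
# Illustrative only; not drawn from the RoMath dataset.
TEMPLATE = "Calculați suma numerelor {a} și {b}."


def generate_problem(rng, low=1, high=100):
    """Sample two operands, fill the template, and compute the answer in code."""
    a = rng.randint(low, high)
    b = rng.randint(low, high)
    return {"problem": TEMPLATE.format(a=a, b=b), "answer": a + b}


def generate_dataset(n, seed=0):
    """Generate n problems reproducibly from a fixed seed."""
    rng = random.Random(seed)
    return [generate_problem(rng) for _ in range(n)]


if __name__ == "__main__":
    for item in generate_dataset(3):
        print(item)
```

Because the answer is derived from the same parameters that fill the template, every generated item is correct by construction, and the seed makes the generated set reproducible.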

📝 Abstract
Mathematics has long been conveyed through natural language, primarily for human understanding. With the rise of mechanized mathematics and proof assistants, there is a growing need to understand informal mathematical text, yet most existing benchmarks focus solely on English, overlooking other languages. This paper introduces RoMath, a Romanian mathematical reasoning benchmark suite comprising three datasets: RoMath-Baccalaureate, RoMath-Competitions and RoMath-Synthetic, which cover a range of mathematical domains and difficulty levels, aiming to improve non-English language models and promote multilingual AI development. By focusing on Romanian, a low-resource language with unique linguistic features, RoMath addresses the limitations of Anglo-centric models and emphasizes the need for dedicated resources beyond simple automatic translation. We benchmark several open-weight language models, highlighting the importance of creating resources for underrepresented languages. We make the code and dataset available.

Problem

Research questions and friction points this paper is trying to address.

Addresses the lack of Romanian mathematical reasoning benchmarks for language models
Aims to improve the capabilities of non-English language models
Highlights the need for dedicated multilingual math resources beyond automatic translation

Innovation

Methods, ideas, or system contributions that make the work stand out.

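Introduces RoMath, a Romanian mathematical reasoning benchmark suite of three datasets: RoMath-Baccalaureate, RoMath-Competitions, and RoMath-Synthetic
Covers a range of mathematical domains and difficulty levels
Targets Romanian, a low-resource language, with dedicated resources rather than simple automatic translation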