mAceReason-Math: A Dataset of High-Quality Multilingual Math Problems Ready For RLVR

📅 2026-03-11
🤖 AI Summary
This work addresses the scarcity of challenging, high-quality multilingual math datasets suitable for Reinforcement Learning with Verifiable Rewards (RLVR), a gap that has kept RLVR research and training data largely English-centric. The authors construct and publicly release mAceReason-Math, a dataset of translated math problems covering 14 languages with more than 10,000 samples per language. The problems are sourced from AceReason-Math, a corpus curated specifically for RLVR, and the translations were subsequently cleaned and improved. The dataset is intended to supply training signals difficult enough for current models and to support multilingual RLVR research and benchmarking.

📝 Abstract
Reinforcement Learning with Verifiable Rewards (RLVR) has been successfully applied to significantly boost the capabilities of pretrained large language models, especially in the math and logic problem domains. However, current research and available training datasets remain English-centric. While multilingual training data and benchmarks have been created in the past, they were not created with RLVR and current model capability in mind, and their level of difficulty is often too low to provide appropriate training signals for current models. To address this gap, we provide mAceReason-Math, a dataset of high-quality translations of challenging math problems sourced from a corpus specifically curated for RLVR (AceReason-Math). We further take specific care to clean and improve our translations, resulting in a coverage of 14 languages with more than 10,000 samples per language. We release the dataset to facilitate multilingual RLVR research and benchmarking in the research community.
Problem

Research questions and friction points this paper is trying to address.

multilingual
math problems
RLVR
dataset
language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

multilingual dataset
Reinforcement Learning with Verifiable Rewards
mathematical reasoning
high-quality translation
RLVR