Bridging the Culture Gap: A Framework for LLM-Driven Socio-Cultural Localization of Math Word Problems in Low-Resource Languages

📅 2025-08-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address English-centric bias in math word problem evaluation for low-resource languages—which arises because sociocultural entities (e.g., person names, currencies, institutions) are left unlocalized in translated benchmarks—this paper proposes the first sociocultural localization framework for math word problems in low-resource languages. Leveraging large language models, the framework automates cultural entity replacement and contextual reconstruction, augmented by rule-based filtering and context-aware consistency verification, to generate authentic, diverse, and high-quality localized datasets. Experiments show that the generated data improves model robustness in local contexts, mitigates entity-level biases inherent in translated benchmarks, and reveals more accurate cross-lingual disparities in mathematical reasoning ability. This work establishes a benchmark and methodology for fair, reliable, and culturally grounded multilingual mathematical reasoning evaluation.
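The pipeline described above—entity replacement, rule-based filtering, and consistency verification—can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the paper uses an LLM for contextual entity replacement, whereas a hypothetical lookup table (`ENTITY_MAP`) stands in here, and the filter and consistency check are simplified stand-ins for the paper's rule-based and context-aware components.

```python
import re

# Hypothetical entity mapping for one target locale (illustrative only;
# the paper's framework generates replacements with an LLM).
ENTITY_MAP = {
    "John": "Juma",
    "Mary": "Amina",
    "dollars": "shillings",
    "$": "KSh ",
}

def replace_entities(problem: str) -> str:
    """Swap English-centric entities for native ones."""
    for src, tgt in ENTITY_MAP.items():
        problem = problem.replace(src, tgt)
    return problem

def passes_rule_filter(original: str, localized: str) -> bool:
    """Rule-based filter: localization must not alter any numbers,
    so the problem's answer stays the same."""
    nums = lambda s: re.findall(r"\d+(?:\.\d+)?", s)
    return nums(original) == nums(localized)

def is_consistent(localized: str) -> bool:
    """Simplified consistency check: no source-language entities
    should survive in the localized problem."""
    return not any(src in localized for src in ENTITY_MAP if src.isalpha())

original = "John has $5 and Mary gives him 3 dollars. How much does he have?"
localized = replace_entities(original)
assert passes_rule_filter(original, localized)
assert is_consistent(localized)
print(localized)
```

In the paper's actual framework, the replacement step is context-aware (e.g., matching gendered names to pronouns and adjusting currency phrasing), which is exactly what a static lookup table cannot do and why the LLM-driven approach is needed.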

📝 Abstract
Large language models (LLMs) have demonstrated significant capabilities in solving mathematical problems expressed in natural language. However, multilingual and culturally grounded mathematical reasoning in low-resource languages lags behind English due to the scarcity of socio-cultural task datasets that reflect accurate native entities such as person names, organization names, and currencies. Existing multilingual benchmarks are predominantly produced via translation and typically retain English-centric entities, owing to the high cost of human annotator-based localization. Moreover, automated localization tools are limited, so truly localized datasets remain scarce. To bridge this gap, we introduce a framework for LLM-driven cultural localization of math word problems that automatically constructs datasets with native names, organizations, and currencies from existing sources. We find that translated benchmarks can obscure true multilingual math ability under appropriate socio-cultural contexts. Through extensive experiments, we also show that our framework can help mitigate English-centric entity bias and improve robustness when native entities are introduced across various languages.
Problem

Research questions and friction points this paper is trying to address.

Addressing scarcity of culturally-grounded math datasets in low-resource languages
Mitigating English-centric entity bias in multilingual mathematical reasoning
Automating socio-cultural localization of math problems with LLMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-driven framework for cultural localization
Automatically constructs datasets with native entities
Mitigates English-centric bias in multilingual benchmarks