The Reasoning Lingua Franca: A Double-Edged Sword for Multilingual AI

📅 2025-10-23
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large reasoning models (LRMs) exhibit a pervasive “English-default” bias in multilingual reasoning, undermining cultural contextual understanding and interpretability. This work presents the first systematic investigation of cross-lingual cognitive reasoning behavior, analyzing inference pathways across languages on the MGSM and GPQA Diamond benchmarks. We find that while English-based reasoning significantly improves overall accuracy—especially on complex tasks—it introduces a novel error class: *translation drift*, wherein critical semantic content from the source language is distorted during translation into English prior to reasoning. Our analysis reveals a fundamental tension between linguistic translation and reasoning capability, demonstrating that accuracy alone is insufficient for evaluating multilingual reasoning. We argue that semantic fidelity must be jointly optimized with correctness, and provide cognitive-level evidence supporting the development of truly multilingual-native reasoning models—models that reason natively in diverse languages without mandatory English mediation.

📝 Abstract
Large Reasoning Models (LRMs) achieve strong performance on mathematical, scientific, and other question-answering tasks, but their multilingual reasoning abilities remain underexplored. When presented with non-English questions, LRMs often default to reasoning in English, raising concerns about interpretability and the handling of linguistic and cultural nuances. We systematically compare an LRM's reasoning in English versus the language of the question. Our evaluation spans two tasks: MGSM and GPQA Diamond. Beyond measuring answer accuracy, we also analyze cognitive attributes in the reasoning traces. We find that English reasoning traces exhibit a substantially higher presence of these cognitive attributes, and that reasoning in English generally yields higher final-answer accuracy, with the performance gap widening as tasks become more complex. However, this English-centric strategy is susceptible to a key failure mode - getting "Lost in Translation," where translation steps introduce errors that reasoning in the question's language would have avoided.
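The comparison described above boils down to bucketing final-answer correctness by the language the model reasoned in. A minimal sketch of that aggregation step is below; the record format (`question_lang`, `reasoning_lang`, `correct`) is an assumption for illustration, not the paper's actual data schema:

```python
from collections import defaultdict

def accuracy_by_reasoning_language(records):
    """Aggregate final-answer accuracy per reasoning language.

    Each record is a dict with (hypothetical) keys:
      - "question_lang":  language of the prompt, e.g. "de"
      - "reasoning_lang": language of the reasoning trace ("en" or the
        question's language)
      - "correct":        bool, final answer matched the gold answer
    """
    totals = defaultdict(lambda: [0, 0])  # lang -> [n_correct, n_total]
    for r in records:
        bucket = totals[r["reasoning_lang"]]
        bucket[0] += int(r["correct"])
        bucket[1] += 1
    return {lang: n_correct / n_total
            for lang, (n_correct, n_total) in totals.items()}

# Toy example: German questions answered with English-mediated
# vs. native-German reasoning.
records = [
    {"question_lang": "de", "reasoning_lang": "en", "correct": True},
    {"question_lang": "de", "reasoning_lang": "en", "correct": True},
    {"question_lang": "de", "reasoning_lang": "de", "correct": True},
    {"question_lang": "de", "reasoning_lang": "de", "correct": False},
]
acc = accuracy_by_reasoning_language(records)
# acc == {"en": 1.0, "de": 0.5}
```

Note this per-language accuracy is exactly the metric the abstract argues is insufficient on its own: a "Lost in Translation" error shows up only when inspecting the reasoning trace, not in this table.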
Problem

Research questions and friction points this paper is trying to address.

Exploring multilingual reasoning abilities in Large Reasoning Models
Analyzing the performance gap between English and native-language reasoning
Identifying translation-induced errors in non-English question processing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluated multilingual reasoning performance across languages
Analyzed cognitive behaviors in reasoning traces systematically
Identified translation-induced errors in English-centric reasoning