🤖 AI Summary
This paper revisits the well-documented "English bias" of large language models (LLMs) on reasoning tasks, showing that non-English inputs (e.g., Chinese, Spanish) can substantially raise the upper bound of reasoning performance. Method: The authors systematically characterize the performance ceiling of multilingual reasoning through multilingual prompt engineering, cross-lingual attribution analysis, translation-robustness evaluation, and a novel answer-aggregation mechanism, comparing rigorously against standard answer-selection methods. Contribution/Results: Multilingual reasoning achieves an average Acc@k gain of nearly 10 points over monolingual English reasoning and remains robust across variations in translation quality and language choice. Empirical analysis shows that conventional English-centric answer-selection strategies fail to reach this ceiling due to inherent linguistic bias. The work offers theoretical grounding, empirical evidence, and scalable technical pathways for moving beyond the English-centered paradigm in LLM reasoning.
📄 Abstract
Previous work indicates that large language models exhibit a significant "English bias", i.e., they often perform better when tasks are presented in English. Interestingly, we observe that using certain other languages in reasoning tasks can yield better performance than English. However, this phenomenon remains under-explored. In this paper, we explore the upper bound of harnessing multilingualism in reasoning tasks, showing that multilingual reasoning promises a significantly (by nearly 10 Acc@$k$ points) and robustly (tolerant of variations in translation quality and language choice) higher upper bound than English-only reasoning. Beyond analyzing the reasons behind this upper bound and the challenges in reaching it, we also find that common answer selection methods cannot achieve it, due to their limitations and biases. These insights could pave the way for future research aimed at fully harnessing the potential of multilingual reasoning in LLMs.
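For concreteness, the Acc@$k$ upper bound discussed above can be read as: a question counts as solved if any of its $k$ candidate answers (e.g., one reasoning path per language) is correct. Below is a minimal sketch of that metric; the function and variable names are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of Acc@k: a question is "solved" if any of its
# first k candidate answers matches the gold answer.
def acc_at_k(candidates_per_question, gold_answers, k):
    """candidates_per_question: one list of candidate answers per question
    (e.g., answers produced under different prompt languages)."""
    solved = 0
    for candidates, gold in zip(candidates_per_question, gold_answers):
        if gold in candidates[:k]:  # any correct candidate counts
            solved += 1
    return solved / len(gold_answers)

# Toy example: 3 questions, k=2 candidates each (say, English and Chinese runs).
cands = [["7", "7"], ["12", "13"], ["5", "6"]]
gold = ["7", "13", "4"]
print(acc_at_k(cands, gold, k=2))  # 2 of 3 questions have a correct candidate
```

This also makes the abstract's point concrete: Acc@$k$ is a ceiling, reached only if an answer selection method can reliably pick the correct candidate when one exists.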