The Impact of Language Mixing on Bilingual LLM Reasoning

📅 2025-07-21
🤖 AI Summary
This study investigates the nature and impact of code-switching in Chinese–English bilingual large language models during reasoning. We find that language switching is not a byproduct of multilingual training but a strategic cognitive behavior, particularly beneficial for mathematical reasoning. Methodologically, we first identify reinforcement learning with verifiable rewards (RLVR) as the critical training stage that triggers code-switching; we then train a lightweight probe to predict whether a potential language switch would help or harm reasoning, and use it to guide decoding. Results show that enforcing monolingual decoding degrades mathematical reasoning accuracy by 5.6 percentage points, whereas our probe-guided code-switching strategy improves accuracy by up to 6.25 percentage points. This work provides the first systematic account of the cognitive mechanisms and practical value of code-switching in bilingual reasoning, establishing a novel paradigm for modeling multilingual reasoning processes.

📝 Abstract
Proficient multilingual speakers often intentionally switch languages in the middle of a conversation. Similarly, recent reasoning-focused bilingual large language models (LLMs) with strong capabilities in both languages exhibit language mixing, alternating languages within their chain of thought. Discouraging this behavior in DeepSeek-R1 was found to degrade accuracy, suggesting that language mixing may benefit reasoning. In this work, we study language switching in Chinese-English bilingual reasoning models. We identify reinforcement learning with verifiable rewards (RLVR) as the critical training stage that leads to language mixing. We demonstrate that language mixing can enhance reasoning: enforcing monolingual decoding reduces accuracy by 5.6 percentage points on math reasoning tasks. Additionally, a lightweight probe can be trained to predict whether a potential language switch would benefit or harm reasoning, and when used to guide decoding, increases accuracy by up to 6.25 percentage points. Our findings suggest that language mixing is not merely a byproduct of multilingual training, but is a strategic reasoning behavior.
Problem

Research questions and friction points this paper is trying to address.

Quantify the impact of language mixing on bilingual LLM reasoning
Identify the RLVR training stage that triggers language switching
Determine whether language mixing strategically enhances reasoning accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Identifies reinforcement learning with verifiable rewards as the stage inducing language mixing
Trains a lightweight probe to predict when a language switch helps or harms reasoning
Guides decoding with the probe to exploit strategic language mixing
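The paper does not publish its probe architecture or decoding interface, but the idea of probe-guided decoding can be sketched: at each step, if the model's top token would switch languages, a small classifier over the hidden state decides whether to allow the switch or restrict sampling to the current language. The vocabulary split, linear probe, and function names below are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical toy vocabulary split by language (token ids).
EN_TOKENS = {0, 1, 2}
ZH_TOKENS = {3, 4, 5}

class SwitchProbe:
    """Linear probe over the hidden state: a positive score means a
    language switch at this step is predicted to benefit reasoning.
    (Assumed form; the paper only says the probe is 'lightweight'.)"""
    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(size=dim)
        self.b = 0.0

    def predict_beneficial(self, hidden_state):
        return float(self.w @ hidden_state + self.b) > 0.0

def probe_guided_step(logits, hidden_state, current_lang_tokens,
                      other_lang_tokens, probe):
    """Greedy step: permit a language switch only if the probe approves;
    otherwise fall back to the best token in the current language."""
    best = int(np.argmax(logits))
    if best in other_lang_tokens and not probe.predict_beneficial(hidden_state):
        allowed = sorted(current_lang_tokens)          # veto the switch
        best = allowed[int(np.argmax(logits[allowed]))]
    return best
```

In a real decoder this check would sit in the sampling loop (e.g. as a logits-masking hook), with the probe trained on hidden states labeled by whether a switch at that position improved the final answer.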