Less Languages, Less Tokens: An Efficient Unified Logic Cross-lingual Chain-of-Thought Reasoning Framework

📅 2026-04-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the high computational cost of existing cross-lingual chain-of-thought methods, which suffer from inefficient full-trajectory sampling and ineffective pruning due to linguistic representation disparities. To overcome these limitations, we propose UL-XCoT, a framework that dynamically selects a small subset of candidate languages within a language-invariant unified logical space. It enables real-time monitoring of reasoning trajectories and early pruning of low-quality paths, followed by voting-based aggregation of high-quality results. UL-XCoT achieves, for the first time, simultaneous compression in both the number of languages and generated tokens. Evaluated on PolyMath (18 languages) and MMLU-ProX-Lite (29 languages), our method maintains competitive accuracy while reducing token consumption by over 50%, substantially improving inference efficiency and robustness for low-resource languages.

📝 Abstract

Cross-lingual chain-of-thought (XCoT) with self-consistency markedly enhances multilingual reasoning, yet existing methods remain costly because they sample full trajectories extensively across languages. Moreover, multilingual LLM representations vary strongly by language, hindering direct feature comparisons and effective pruning. Motivated by this, we introduce UL-XCoT, the first efficient unified logic cross-lingual reasoning framework that minimizes redundancy in token usage and latency, yielding the greatest efficiency under limited sampling budgets during inference. Specifically, UL-XCoT (1) uses fewer languages by selecting, per query, a small candidate language set in a language-invariant unified logic space, (2) uses fewer tokens by monitoring logic-space trajectory dynamics during decoding to prune low-quality reasoning paths, and (3) aggregates the remaining high-quality trajectories via voting. Experiments on PolyMath across 18 languages and MMLU-ProX-Lite across 29 languages with DeepSeek-R1-Distill-Qwen-7B demonstrate that UL-XCoT achieves competitive accuracy while cutting decoding token cost by over 50% versus prior sampling baselines. UL-XCoT also delivers more stable gains on low-resource languages, underscoring its robustness where the standard XCoT self-consistency method fails.
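
The prune-then-vote part of the pipeline described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the per-step quality scores, the fixed threshold, and the names `decode_with_pruning` and `vote` are all hypothetical stand-ins for whatever signal UL-XCoT derives from its unified logic space. The sketch only shows why stopping a weak path early saves decoding steps while the surviving paths still decide the answer by majority.

```python
from collections import Counter


def decode_with_pruning(step_scores, threshold):
    """Simulate decoding one reasoning path step by step.

    step_scores is a hypothetical per-step quality signal (an assumption,
    not the paper's actual metric). Decoding stops at the first step whose
    score falls below the threshold. Returns (steps_decoded, survived).
    """
    for step, score in enumerate(step_scores, start=1):
        if score < threshold:
            return step, False
    return len(step_scores), True


def vote(answers):
    """Return the most common answer, or None if no path survived."""
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


# Hypothetical candidate languages with made-up scores and answers.
paths = {
    "en": ([0.9, 0.8, 0.9], "42"),
    "zh": ([0.7, 0.8, 0.6], "42"),
    "sw": ([0.6, 0.2, 0.7], "17"),  # drops below the threshold at step 2
    "de": ([0.8, 0.7, 0.9], "41"),
}

steps_used = 0
surviving_answers = []
for language, (scores, answer) in paths.items():
    decoded, survived = decode_with_pruning(scores, threshold=0.5)
    steps_used += decoded
    if survived:
        surviving_answers.append(answer)

final_answer = vote(surviving_answers)
full_cost = sum(len(scores) for scores, _ in paths.values())
print(final_answer, steps_used, full_cost)
```

In this toy run the pruned path stops after two of its three steps, so the total decoding cost falls below that of sampling every path in full, and the vote is taken only over the three paths that survived.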

Problem

Research questions and friction points this paper is trying to address.

cross-lingual chain-of-thought

multilingual reasoning

token efficiency

language representation disparity

low-resource languages

Innovation

Methods, ideas, or system contributions that make the work stand out.

cross-lingual chain-of-thought

unified logic space

trajectory pruning