🤖 AI Summary
This study investigates the mechanisms underlying large language models' (LLMs) cross-lingual ability. The authors propose a word-level cross-lingual translation task and trace intermediate-layer outputs to identify two distinct behaviors in the forward pass: co-occurrence behavior and semantic pivot behavior. They attribute these behaviors to word co-occurrence frequency and mine semantic pivots — word pairs with both high cross-lingual co-occurrence frequency and stable meaning — from the pre-training data, then design a semantic pivot-aware method for reconstructing the pre-training corpus from documents rich in such pivots. The approach improves cross-lingual generalization, yielding an average gain of +2.3 percentage points on multilingual understanding benchmarks including XNLI and XCOPA, supporting the view that semantic pivots mediate cross-lingual transfer. The core contributions are: (1) uncovering the role semantic pivots play in shaping LLMs' cross-lingual ability; and (2) introducing a scalable, data-driven corpus reconstruction method grounded in this finding.
📝 Abstract
Large language models (LLMs) demonstrate remarkable ability on cross-lingual tasks, and understanding how they acquire this ability is crucial for their interpretability. To quantify the cross-lingual ability of LLMs accurately, we propose a Word-Level Cross-Lingual Translation Task. To investigate how LLMs learn cross-lingual ability, we trace the outputs of LLMs' intermediate layers on this word translation task and identify two distinct behaviors in the forward pass: co-occurrence behavior and semantic pivot behavior. We attribute these two behaviors to the co-occurrence frequency of words and identify semantic pivots in the pre-training dataset. Finally, to apply our findings, we reconstruct a semantic pivot-aware pre-training dataset using documents with a high proportion of semantic pivots. Our experiments validate the effectiveness of this approach in enhancing cross-lingual ability. Our research contributes insights into the interpretability of LLMs and offers a method for improving their cross-lingual ability.
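The dataset-reconstruction step described above can be sketched as a simple filtering pass: score each document by the proportion of its tokens that are semantic pivots, then keep documents above a threshold. This is only a minimal illustration of the idea; the pivot list, whitespace tokenization, and threshold below are assumptions for the sketch, not the paper's actual settings.

```python
def pivot_proportion(doc: str, pivots: set) -> float:
    """Fraction of whitespace-split tokens that are semantic pivots."""
    tokens = doc.lower().split()
    if not tokens:
        return 0.0
    return sum(t in pivots for t in tokens) / len(tokens)

def reconstruct_corpus(docs, pivots, threshold=0.05):
    """Keep documents whose semantic-pivot proportion exceeds the threshold."""
    return [d for d in docs if pivot_proportion(d, pivots) > threshold]

# Toy example with an illustrative (hypothetical) pivot list.
pivots = {"the", "2024", "ok"}
docs = [
    "the cat sat on the mat",           # 2/6 tokens are pivots -> kept
    "completely unrelated words here",  # 0/4 tokens are pivots -> dropped
]
kept = reconstruct_corpus(docs, pivots, threshold=0.1)
print(len(kept))  # → 1
```

In practice a real pipeline would use the model's tokenizer and pivots mined from co-occurrence statistics rather than a hand-written word list, but the selection logic stays the same.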