Language-specific Neurons Do Not Facilitate Cross-Lingual Transfer

📅 2025-03-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates whether identifying and intervening on language-specific neurons can enhance cross-lingual transfer in multilingual large language models (MLLMs) for low-resource languages. Building on Llama 3.1 and Mistral Nemo, we identify language-specific neurons with Language Activation Probability Entropy and activation-threshold-based methods, apply neuron-level LoRA fine-tuning to those neurons, and systematically evaluate the interventions on XNLI and XQuAD. Contrary to prevailing assumptions, this empirical study finds that existing neuron identification techniques fail to reliably localize units critical for cross-lingual generalization; as a result, targeted intervention consistently degrades, rather than improves, performance on low-resource languages. These negative results challenge the "interpretability-driven cross-lingual optimization" paradigm, underscore limitations of current neuron-level analysis, and offer insights for rethinking MLLM architecture design and intervention strategies.

📝 Abstract
Multilingual large language models (LLMs) aim towards robust natural language understanding across diverse languages, yet their performance significantly degrades on low-resource languages. This work explores whether existing techniques to identify language-specific neurons can be leveraged to enhance cross-lingual task performance of low-resource languages. We conduct detailed experiments covering existing language-specific neuron identification techniques (such as Language Activation Probability Entropy and activation probability-based thresholding) and neuron-specific LoRA fine-tuning with models like Llama 3.1 and Mistral Nemo. We find that such neuron-specific interventions are insufficient to yield cross-lingual improvements on downstream tasks (XNLI, XQuAD) in low-resource languages. This study highlights the challenges in achieving cross-lingual generalization and provides critical insights for multilingual LLMs.
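To make the Language Activation Probability Entropy (LAPE) idea concrete: for each neuron, collect its activation probability in each language, normalize those probabilities into a distribution, and rank neurons by the entropy of that distribution, treating the lowest-entropy neurons as language-specific. The sketch below is an illustrative reconstruction under these assumptions, not the paper's code; the function names and the quantile cutoff are hypothetical.

```python
import numpy as np

def lape_scores(act_probs: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """LAPE sketch (illustrative, not the paper's implementation).

    act_probs: (num_neurons, num_languages) array, where entry (i, j) is the
    probability that neuron i activates on text in language j.
    A low entropy means the neuron fires mostly for a few languages,
    i.e. it is a candidate language-specific neuron.
    """
    # Normalize each neuron's per-language activation probabilities
    # into a distribution over languages.
    p = act_probs / (act_probs.sum(axis=1, keepdims=True) + eps)
    # Shannon entropy of that distribution, per neuron.
    return -(p * np.log(p + eps)).sum(axis=1)

def language_specific_neurons(act_probs: np.ndarray, quantile: float = 0.05) -> np.ndarray:
    """Select the lowest-entropy neurons as language-specific (cutoff is illustrative)."""
    scores = lape_scores(act_probs)
    threshold = np.quantile(scores, quantile)
    return np.where(scores <= threshold)[0]
```

For example, a neuron that activates almost exclusively on one language gets near-zero entropy and is selected, while neurons that fire uniformly across languages are not.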
Problem

Research questions and friction points this paper is trying to address.

Enhancing cross-lingual task performance for low-resource languages
Evaluating language-specific neuron identification techniques
Assessing neuron-specific interventions in multilingual LLMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Language-specific neuron identification techniques
Neuron-specific LoRA fine-tuning
Cross-lingual task performance evaluation
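A minimal sketch of what "neuron-specific LoRA fine-tuning" might look like, assuming the low-rank update is masked so that only rows of a frozen linear layer corresponding to previously identified neurons are modified. The class name, masking scheme, and hyperparameters are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

class NeuronMaskedLoRALinear(nn.Module):
    """Hypothetical sketch: LoRA update restricted to selected output neurons.

    The base weight is frozen; a rank-r update B @ A is added, with rows of B
    zeroed out for all neurons except those in `neuron_idx`, so gradients only
    flow into the selected neurons' update.
    """
    def __init__(self, base: nn.Linear, neuron_idx, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pretrained weight
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no-op at start
        mask = torch.zeros(base.out_features, 1)
        mask[list(neuron_idx)] = 1.0
        self.register_buffer("mask", mask)
        self.scaling = alpha / r

    def forward(self, x):
        # Rows of the update for non-selected neurons are zeroed by the mask.
        delta = (self.mask * self.B) @ self.A
        return self.base(x) + nn.functional.linear(x, delta) * self.scaling
```

Because `B` starts at zero, the wrapped layer initially reproduces the base layer exactly; after fine-tuning, only the outputs of the masked-in neurons can differ.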