🤖 AI Summary
This study investigates how language-specific representations in multilingual large language models (MLLMs) spontaneously evolve into cross-lingual abstract representations during pre-training—without explicit cross-lingual supervision. We propose a multi-scale interpretability framework integrating parameter-space analysis, layer-wise probing, and neuron trajectory tracking. Empirically, we find that semantic alignment concentrates in specific subsets of neurons at middle-to-upper layers, and we discover cross-lingually stable “concept-encoding neurons” that generalize across languages. Our results reveal a hierarchical compression pathway—from language-specific representation to cross-lingual abstraction—and quantitatively characterize the dynamic coupling between declining language-identification capability and progressive semantic alignment across layers. Key contributions are: (1) precise localization of the neural layers critical for cross-lingual alignment; (2) discovery of robust, shared-concept neurons exhibiting strong cross-lingual generalization; and (3) establishment of a principled, multi-scale interpretability paradigm for analyzing representation evolution in MLLMs.
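The “concept-encoding neuron” idea above can be made concrete with a small sketch. The code below is illustrative only, not the paper's method: it scores each neuron by how well its activation separates concept-present from concept-absent inputs within each language, then keeps neurons whose *worst-language* score remains high. All activations and the planted `shared_neuron` are synthetic stand-ins; in the study, activations would come from MLLM hidden states.

```python
# Hypothetical sketch of locating cross-lingually stable concept neurons.
# Synthetic activations stand in for real MLLM hidden states; neuron 7 is
# deliberately planted so it fires for the concept in every language.
import numpy as np

n_neurons, n_samples = 50, 400
shared_neuron = 7  # planted: encodes the concept in ALL languages

def activations(language_seed):
    """Synthetic (activations, concept labels) for one language."""
    lang_rng = np.random.default_rng(language_seed)
    labels = lang_rng.integers(0, 2, n_samples)
    acts = lang_rng.standard_normal((n_samples, n_neurons))
    acts[:, shared_neuron] += 2.0 * labels  # concept signal on the same neuron
    return acts, labels

def neuron_scores(acts, labels):
    """Per-neuron separation: |mean activation difference| between classes."""
    return np.abs(acts[labels == 1].mean(0) - acts[labels == 0].mean(0))

# Cross-lingual score = worst-case score over languages, so a neuron only
# ranks highly if it predicts the concept in every language.
per_lang = [neuron_scores(*activations(seed)) for seed in (10, 20, 30)]
cross_lingual = np.min(per_lang, axis=0)
print("top cross-lingual concept neuron:", int(np.argmax(cross_lingual)))
```

Taking the minimum over languages, rather than the mean, is what enforces the “stable across languages” criterion: a neuron that strongly encodes a concept in only one language scores poorly.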
📝 Abstract
Multilingual language models (MLLMs) have demonstrated remarkable abilities to transfer knowledge across languages, despite being trained without explicit cross-lingual supervision. We analyze the parameter spaces of three MLLMs to study how their representations evolve during pre-training, observing patterns consistent with compression: models initially form language-specific representations, which gradually converge into cross-lingual abstractions as training progresses. Through probing experiments, we observe a clear transition from uniform language identification capabilities across layers to more specialized layer functions. For deeper analysis, we focus on neurons that encode distinct semantic concepts. By tracing their development during pre-training, we show how they gradually align across languages. Notably, we identify specific neurons that emerge as increasingly reliable predictors for the same concepts across languages.
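The layer-wise probing experiments described above can be sketched in miniature. This is a hedged illustration, not the paper's code: a linear probe is trained on each layer's hidden states to predict the input language, and probe accuracy is compared across layers. Real activations would come from an MLLM; here synthetic Gaussian clusters stand in, with cluster separation shrinking at deeper “layers” to mimic the reported decline in language identifiability.

```python
# Illustrative sketch of layer-wise language-ID probing on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_langs, n_per_lang, dim = 3, 200, 32

def synthetic_layer(separation):
    """Fake hidden states for one layer: one Gaussian cluster per language."""
    X, y = [], []
    for lang in range(n_langs):
        center = rng.standard_normal(dim) * separation
        X.append(center + rng.standard_normal((n_per_lang, dim)))
        y.append(np.full(n_per_lang, lang))
    return np.vstack(X), np.concatenate(y)

def probe_accuracy(X, y):
    """Train a linear probe and report held-out language-ID accuracy."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return clf.score(X_te, y_te)

# Shallow "layers" keep language identity well separated; deep ones
# compress it away, so probe accuracy drops toward chance.
for name, sep in [("shallow", 3.0), ("middle", 1.0), ("deep", 0.1)]:
    acc = probe_accuracy(*synthetic_layer(sep))
    print(f"{name}: language-ID probe accuracy = {acc:.2f}")
```

A falling accuracy curve like this, measured per layer on real activations, is the probing signature of the transition from language-specific to cross-lingual representations.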