Exploring Multilingual Probing in Large Language Models: A Cross-Language Analysis

📅 2024-09-22
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
Large language models (LLMs) exhibit weak and imbalanced structural knowledge representation, particularly for part-of-speech (POS) tags and dependency relations, in low-resource languages compared with high-resource ones. Method: We propose a cross-lingual linear probing framework to systematically analyze the distribution and layer-wise evolution of syntactic knowledge in multilingual LLM representations. Our analysis spans high- and low-resource language groups, employing layer-wise accuracy tracking, cosine similarity measurements, and multilingual comparative experiments. Contribution/Results: We are the first to identify three systematic disparities: (1) significantly lower probing accuracy on low-resource languages; (2) diminished gains from deeper layers and flatter layer-wise accuracy trends; and (3) substantially reduced intra- and cross-group representation similarity. These findings show that structural knowledge transfer is critically resource-dependent, a core bottleneck in multilingual modeling. Our work provides empirical grounding and concrete directions for developing more equitable multilingual representation learning strategies.
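The core of the method, training a linear probe on frozen layer representations, tracking accuracy per layer, and comparing probe weight vectors by cosine similarity, can be sketched as below. This is a toy illustration only: the "hidden states" are synthetic Gaussian features (with the class signal growing stronger in "deeper layers" to mimic the paper's layer-wise trend), and the probe is a simple perceptron-style classifier rather than the paper's actual setup.

```python
import math
import random

random.seed(0)

def cosine(u, v):
    """Cosine similarity between two probe weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def train_probe(X, y, dim, epochs=20, lr=0.1):
    """Train a binary linear probe (perceptron-style) on frozen features."""
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = label - pred
            if err:
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err
    return w, b

def accuracy(w, b, X, y):
    correct = sum(
        (1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0) == label
        for x, label in zip(X, y)
    )
    return correct / len(y)

def make_layer(n, dim, sep):
    """Synthetic stand-in for one layer's hidden states: two classes
    separated along the first dimension by +/- sep (hypothetical data)."""
    X, y = [], []
    for _ in range(n):
        label = random.randint(0, 1)
        x = [random.gauss(sep if label else -sep, 1.0)]
        x += [random.gauss(0.0, 1.0) for _ in range(dim - 1)]
        X.append(x)
        y.append(label)
    return X, y

dim, n = 8, 200
layer_accs, probes = [], []
# Deeper "layers" carry a clearer linear signal, so probe accuracy rises.
for sep in [0.2, 0.8, 1.6]:
    X, y = make_layer(n, dim, sep)
    w, b = train_probe(X, y, dim)
    probes.append(w)
    layer_accs.append(accuracy(w, b, X, y))

print("layer-wise probe accuracy:", [round(a, 2) for a in layer_accs])
print("cosine(probe L1, probe L2):", round(cosine(probes[1], probes[2]), 2))
```

In the paper's setting, the same comparison is made between probes trained on different languages: flatter accuracy curves and lower cosine similarities for low-resource languages are exactly the disparities summarized above.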

📝 Abstract
Probing techniques for large language models (LLMs) have primarily focused on English, overlooking the vast majority of the world's languages. In this paper, we extend these probing methods to a multilingual context, investigating the behaviors of LLMs across diverse languages. We conduct experiments on several open-source LLMs, analyzing probing accuracy, trends across layers, and similarities between probing vectors for multiple languages. Our key findings reveal: (1) a consistent performance gap between high-resource and low-resource languages, with high-resource languages achieving significantly higher probing accuracy; (2) divergent layer-wise accuracy trends, where high-resource languages show substantial improvement in deeper layers similar to English, while low-resource languages exhibit flatter trends with smaller gains; and (3) higher representational similarities among high-resource languages, with low-resource languages demonstrating lower similarities both among themselves and with high-resource languages. These results highlight significant disparities in LLMs' multilingual capabilities and emphasize the need for improved modeling of low-resource languages.
Problem

Research questions and friction points this paper is trying to address.

Multilingual Language Models
Resource-imbalanced Languages
Accuracy Disparity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multilingual Language Models
Resource Disparity Impact
Cross-lingual Relationships