🤖 AI Summary
Multilingual large language models (LLMs) exhibit significant cross-lingual inconsistency in factual recall, primarily due to failures in cross-lingual entity alignment. Method: We identify entity alignment as the critical bottleneck for cross-lingual consistency and propose an English-hub translation-based prompting framework—SubSub (subject substitution) and SubInj (subject injection)—to explicitly enforce subject alignment within a shared conceptual space. Alignment quality is quantified via entity-level translation tasks, and internal representation analysis is combined with prompt engineering to jointly improve alignment fidelity and factual accuracy. Contribution/Results: Experiments across multiple multilingual benchmarks demonstrate substantial improvements in cross-lingual consistency (+12.3%) and factual recall accuracy (+9.8%). This work provides the first systematic empirical validation that entity alignment is a decisive factor governing multilingual factual recall in LLMs.
📝 Abstract
Multilingual large language models (LLMs) are expected to recall factual knowledge consistently across languages. However, the factors that give rise to such cross-lingual consistency -- and its frequent failure -- remain poorly understood. In this work, we hypothesize that these inconsistencies may arise from failures in entity alignment, the process of mapping subject and object entities into a shared conceptual space across languages. To test this, we assess alignment through entity-level (subject and object) translation tasks, and find that consistency is strongly correlated with alignment across all studied models, with misalignment of subjects or objects frequently resulting in inconsistencies. Building on this insight, we propose SubSub and SubInj, two effective methods that integrate English translations of subjects into prompts across languages, leading to substantial gains in both factual recall accuracy and consistency. Finally, our mechanistic analysis reveals that these interventions reinforce entity representation alignment in the conceptual space through the model's internal pivot-language processing, offering effective and practical strategies for improving multilingual factual prediction.
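To make the two interventions concrete, here is a minimal sketch of how subject substitution (SubSub) and subject injection (SubInj) might transform a fact-recall prompt. The exact prompt templates and the parenthetical injection format are assumptions for illustration; the paper only specifies that English translations of subjects are integrated into the in-language prompts.

```python
def subsub(prompt: str, subject: str, subject_en: str) -> str:
    """SubSub (subject substitution): replace the in-language subject
    mention with its English translation. Template is illustrative."""
    return prompt.replace(subject, subject_en)


def subinj(prompt: str, subject: str, subject_en: str) -> str:
    """SubInj (subject injection): keep the original subject and inject
    its English translation alongside it. The "(English)" annotation
    format is a hypothetical choice, not taken from the paper."""
    return prompt.replace(subject, f"{subject} ({subject_en})")


# Illustrative Spanish fact-recall prompt for the fact "capital of Japan".
prompt = "La capital de Japón es"
print(subsub(prompt, "Japón", "Japan"))  # "La capital de Japan es"
print(subinj(prompt, "Japón", "Japan"))  # "La capital de Japón (Japan) es"
```

Both variants keep the query in the target language while anchoring the subject entity to its English form, which is the mechanism the abstract credits with strengthening entity alignment via the model's internal pivot-language processing.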