How Good is BLI as an Alignment Measure: A Study in Word Embedding Paradigm

📅 2025-11-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study systematically evaluates bilingual lexicon induction (BLI) as a metric for assessing the quality of embedding space alignment, identifying both its validity and its limitations. Covering diverse language pairs, including high- and low-resource settings and typologically distinct language families, the authors compare BLI performance across traditional linear alignment methods, multilingual pretrained models (e.g., mBERT, XLM-R), and combined alignment strategies. They propose a stem-based BLI approach to improve matching accuracy for inflectional languages and introduce a more robust lexical pruning mechanism. Results show that combined methods generally achieve the best BLI performance, while multilingual models exhibit pronounced advantages for low-resource languages. Crucially, standard BLI is found to be sensitive to vocabulary coverage bias and morphological noise, and thus fails to reliably reflect true alignment quality. The work establishes a more rigorous methodological framework for evaluating embedding alignment, supported by extensive empirical analysis across linguistic settings.

📝 Abstract
Aside from a dwindling number of monolingual embedding studies, originating predominantly from low-resource domains, it is evident that multilingual embedding has become the de facto choice: it adapts to code-mixed language use, can process multilingual documents in a language-agnostic manner, and removes the difficult task of aligning monolingual embeddings. But is this victory complete? Are multilingual models better than aligned monolingual models in every aspect? Can the higher computational cost of multilingual models always be justified? Or is there a compromise between the two extremes? Bilingual Lexicon Induction (BLI) is one of the most widely used metrics for evaluating the degree of alignment between two embedding spaces. In this study, we explore the strengths and limitations of BLI as a measure of how well two embedding spaces are aligned. Further, we evaluate how well traditional embedding alignment techniques, novel multilingual models, and combined alignment techniques perform on BLI tasks in both high-resource and low-resource language contexts. In addition, we investigate the impact of the language families to which the paired languages belong. We identify cases in which BLI does not measure the true degree of alignment, and we propose solutions for them. We propose a novel stem-based BLI approach that, unlike the prevalent word-based BLI techniques, takes the inflected nature of languages into account when evaluating two aligned embedding spaces. Further, we introduce a vocabulary pruning technique that is more informative about the degree of alignment, especially when performing BLI on multilingual embedding models. Combined embedding alignment techniques often perform better, while in certain cases, mainly low-resource languages, multilingual embeddings perform better.
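To make the metric concrete, word-based BLI is typically scored as precision@1 of a nearest-neighbor retrieval over the (already aligned) embedding spaces. The following is a minimal sketch of that scoring loop using cosine similarity; the function name and the toy data are illustrative, not from the paper, and real evaluations often use CSLS rather than plain cosine to mitigate hubness.

```python
import numpy as np

def bli_precision_at_1(src_emb, tgt_emb, lexicon):
    """Word-based BLI: for each source word, retrieve the nearest target
    word by cosine similarity and check it against the gold translation.
    `lexicon` maps source row index -> gold target row index."""
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sims = src @ tgt.T                 # cosine similarity matrix
    predictions = sims.argmax(axis=1)  # nearest target index per source word
    correct = sum(int(predictions[s] == t) for s, t in lexicon.items())
    return correct / len(lexicon)

# Toy example: three "source" vectors that are near-copies of the three
# "target" vectors, i.e. an almost perfectly aligned pair of spaces.
rng = np.random.default_rng(0)
tgt = rng.normal(size=(3, 4))
src = tgt + 0.01 * rng.normal(size=(3, 4))
print(bli_precision_at_1(src, tgt, {0: 0, 1: 1, 2: 2}))  # → 1.0
```

Precision@1 only rewards exact-word matches, which is exactly the property the stem-based variant proposed above is meant to relax for inflected languages.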
Problem

Research questions and friction points this paper is trying to address.

Evaluating BLI as a measure for embedding space alignment quality
Comparing performance of traditional, multilingual, and combined alignment techniques
Assessing alignment methods across high-resource and low-resource language pairs
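The "traditional" alignment techniques referenced above are typically linear maps fitted on a seed dictionary, the classic case being orthogonal Procrustes. A minimal sketch of that closed-form solution (this is the standard textbook method, not the paper's specific pipeline):

```python
import numpy as np

def procrustes_align(X, Y):
    """Find the orthogonal matrix W minimizing ||XW - Y||_F, where the
    rows of X (source) and Y (target) are embeddings of seed-dictionary
    translation pairs. Closed-form solution via SVD of X^T Y."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Toy check: if Y is X under a known rotation, Procrustes recovers it.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))  # random orthogonal map
Y = X @ Q
W = procrustes_align(X, Y)
print(np.allclose(X @ W, Y))  # → True
```

Because W is constrained to be orthogonal, the map preserves distances within the source space, which is one reason BLI on such alignments is sensitive to how comparable the two monolingual spaces were to begin with.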
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes stem-based BLI approach for inflected languages
Introduces vocabulary pruning technique for alignment evaluation
Compares traditional and multilingual embedding alignment methods
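The stem-based BLI idea can be sketched as follows: a predicted translation counts as correct when its stem matches the stem of a gold translation, so inflectional variants are not penalized. The stemmer below is a deliberately crude placeholder and the function names are illustrative; the paper does not specify this exact procedure, and a real setup would use a language-appropriate stemmer (e.g. Snowball).

```python
def simple_stem(word, suffixes=("ing", "ed", "es", "s")):
    """Placeholder stemmer: strips at most one common English suffix."""
    for suf in suffixes:
        if word.endswith(suf) and len(word) > len(suf) + 2:
            return word[: -len(suf)]
    return word

def stem_bli_accuracy(predictions, gold, stem=simple_stem):
    """Stem-based BLI: a prediction is correct if its stem matches the
    stem of any gold translation for that source word."""
    correct = sum(
        1 for src, pred in predictions.items()
        if stem(pred) in {stem(g) for g in gold.get(src, [])}
    )
    return correct / len(predictions)

# Word-based BLI would score both of these as wrong; stem-based BLI
# accepts the inflected forms.
preds = {"hablar": "talking", "caminar": "walked"}
gold = {"hablar": ["talk"], "caminar": ["walk"]}
print(stem_bli_accuracy(preds, gold))  # → 1.0
```

This matching relaxation is what lets the metric separate genuine alignment failures from surface-form mismatches in morphologically rich languages.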