🤖 AI Summary
In cross-lingual transfer with multilingual models, full realignment can degrade performance—especially for target languages typologically distant from the source—with the damage most pronounced in lower-layer parameters. To address this, we propose AlignFreeze: a method that freezes either the lower or upper half of a Transformer encoder's layers during realignment, preserving the representations those layers encode. Evaluated on four NLP tasks (including PoS tagging and NER) across 35 languages and three model architectures (including XLM-R), AlignFreeze improves cross-lingual robustness where full realignment fails. Notably, with XLM-R it improves PoS tagging accuracy by more than one standard deviation in seven more languages than full realignment. Our key contributions are twofold: (1) an empirical analysis showing that realignment affects all layers but is most detrimental to the lower ones, and (2) a selective layer-freezing strategy that prevents this degradation without task-specific modifications. This work advances principled, parameter-efficient methods for cross-lingual adaptation.
📝 Abstract
Realignment techniques are often employed to enhance cross-lingual transfer in multilingual language models; however, they can sometimes degrade performance in languages that differ significantly from the fine-tuning source language. This paper introduces AlignFreeze, a method that freezes either the lower or upper half of a model's layers during realignment. Through controlled experiments on 4 tasks, 3 models, and 35 languages, we find that realignment affects all layers but can be most detrimental to the lower ones. Freezing the lower layers can prevent this performance degradation. In particular, AlignFreeze improves Part-of-Speech (PoS) tagging performance in languages where full realignment fails: with XLM-R, it yields improvements of more than one standard deviation in accuracy in seven more languages than full realignment.
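The core mechanism described above—freezing either the lower or the upper half of the encoder's layers during realignment—can be sketched as a simple layer-selection rule. This is a minimal illustration, not the paper's implementation: the function name, the even half-split convention, and the indexing (layer 0 nearest the embeddings) are assumptions.

```python
def select_frozen_layers(num_layers, freeze="lower"):
    """Return the indices of encoder layers to freeze during realignment.

    Hypothetical helper illustrating AlignFreeze's two variants:
    freezing the lower half or the upper half of the layer stack.
    Layer 0 is assumed to be the layer closest to the embeddings.
    """
    half = num_layers // 2
    if freeze == "lower":
        return list(range(half))              # layers 0 .. half-1
    if freeze == "upper":
        return list(range(half, num_layers))  # layers half .. num_layers-1
    raise ValueError("freeze must be 'lower' or 'upper'")

# XLM-R base has 12 encoder layers; freezing the lower half
# excludes layers 0-5 from the realignment update.
print(select_frozen_layers(12, "lower"))
print(select_frozen_layers(12, "upper"))
```

In a training loop, the returned indices would typically be used to set `requires_grad = False` on the parameters of those layers before the realignment step, so only the unfrozen half is updated.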