🤖 AI Summary
This study addresses a critical limitation of existing cross-lingual transfer approaches: comparisons of source language selection strategies do not control for total training data, conflating the effects of language choice and data volume and thereby hindering effective support for low-resource African languages in NLP tasks. To resolve this, the work formulates multi-source cross-lingual transfer as a resource allocation problem under a fixed annotation budget, jointly optimizing which source languages to select and how much data to allocate from each. Using mBERT and XLM-R, the authors conduct 288 experiments on Hausa, Yoruba, and Swahili, systematically evaluating four allocation strategies on named entity recognition (NER) and sentiment analysis. Results demonstrate that multi-source transfer significantly outperforms single-source transfer (Cohen's d = 0.80–1.98), that differences among allocation strategies are marginal, and that the efficacy of embedding similarity as a source-selection proxy is task-dependent: random selection excels in NER, whereas similarity-based selection performs better in sentiment analysis.
📝 Abstract
Cross-lingual transfer learning enables NLP for low-resource languages by leveraging labeled data from higher-resource sources, yet existing comparisons of source language selection strategies do not control for total training data, confounding language selection effects with data quantity effects. We introduce Budget-Xfer, a framework that formulates multi-source cross-lingual transfer as a budget-constrained resource allocation problem. Given a fixed annotation budget B, our framework jointly optimizes which source languages to include and how much data to allocate from each. We evaluate four allocation strategies across named entity recognition and sentiment analysis for three African target languages (Hausa, Yoruba, Swahili) using two multilingual models, conducting 288 experiments. Our results show that (1) multi-source transfer significantly outperforms single-source transfer (Cohen's d = 0.80 to 1.98), driven by a structural budget underutilization bottleneck; (2) among multi-source strategies, differences are modest and non-significant; and (3) the value of embedding similarity as a selection proxy is task-dependent, with random selection outperforming similarity-based selection for NER but not sentiment analysis.
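The budget-constrained allocation idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the "equal" and "proportional" strategy labels, and the similarity scores in the example are all hypothetical assumptions; the paper's actual strategies and scores may differ.

```python
import random

def allocate_budget(budget, candidates, similarity=None, strategy="equal", k=3):
    """Return {language: n_examples} summing to at most `budget`.

    Hypothetical sketch: pick k source languages (by similarity to the
    target if scores are given, otherwise at random), then split the
    annotation budget B among them either evenly ("equal") or weighted
    by similarity ("proportional").
    """
    if similarity:
        chosen = sorted(candidates, key=lambda lang: -similarity[lang])[:k]
    else:
        chosen = random.sample(candidates, k)
    if strategy == "proportional" and similarity:
        total = sum(similarity[lang] for lang in chosen)
        return {lang: int(budget * similarity[lang] / total) for lang in chosen}
    return {lang: budget // len(chosen) for lang in chosen}

# Illustrative only: 10k-example budget split over the 3 most similar sources.
sims = {"amh": 0.42, "ibo": 0.61, "kin": 0.55, "pcm": 0.37, "wol": 0.48}
plan = allocate_budget(10_000, list(sims), similarity=sims, strategy="proportional")
```

Random selection corresponds to dropping the `similarity` argument, which is the variant the paper finds competitive for NER.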