🤖 AI Summary
Large language models (LLMs) exhibit substantially weaker complex-reasoning capabilities in low-resource languages (LRLs) than in English. To address this gap, we propose MERLIN, a two-stage model-stacking framework that integrates a multilingual encoder with an LLM and trains it with a multi-phase, curriculum-based alignment mechanism: first aligning semantic spaces via bilingual parallel texts, then progressively transferring to task-specific data (e.g., mathematical reasoning). The method achieves efficient cross-lingual knowledge transfer by fine-tuning only a small set of DoRA low-rank parameters. On AfriMGSM it improves accuracy by 12.9 percentage points over MindMerger and also outperforms GPT-4o-mini; consistent gains are observed on MGSM and MSVAMP as well. The core contributions are (i) curriculum-driven progressive representation alignment and (ii) a lightweight cross-lingual adaptation strategy, which together narrow the reasoning gap between LRLs and high-resource languages.
📝 Abstract
Large language models excel in English but still struggle with complex reasoning in many low-resource languages (LRLs). Existing encoder-plus-decoder methods such as LangBridge and MindMerger raise accuracy on mid- and high-resource languages, yet they leave a large gap on LRLs. We present MERLIN, a two-stage model-stacking framework that applies a curriculum learning strategy, moving from general bilingual bitext to task-specific data, and adapts only a small set of DoRA weights. On the AfriMGSM benchmark, MERLIN improves exact-match accuracy by +12.9 pp over MindMerger and outperforms GPT-4o-mini. It also yields consistent gains on MGSM and MSVAMP (+0.9 and +2.8 pp), demonstrating effectiveness across both low- and high-resource settings.
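The DoRA adaptation mentioned above can be illustrated with a minimal sketch of its published magnitude-direction decomposition (Weight-Decomposed Low-Rank Adaptation): the frozen weight is split into a per-column magnitude and a direction, and only the magnitude vector plus two low-rank factors are trained. The matrix sizes, initialization, and function name here are illustrative assumptions, not the paper's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 16, 2                      # illustrative sizes, not from the paper

W0 = rng.normal(size=(d_out, d_in))            # frozen pretrained weight (not trained)
m = np.linalg.norm(W0, axis=0, keepdims=True)  # trainable magnitude, init = column norms of W0
B = np.zeros((d_out, r))                       # trainable low-rank factor, zero-initialized
A = rng.normal(size=(r, d_in)) * 0.01          # trainable low-rank factor

def dora_weight(W0, m, B, A):
    """Adapted weight W' = m * (W0 + B@A) / ||W0 + B@A||, column-wise norm."""
    V = W0 + B @ A
    return m * (V / np.linalg.norm(V, axis=0, keepdims=True))

# With B zero-initialized, the adapted weight equals the frozen weight,
# so training starts from the pretrained model's behavior.
assert np.allclose(dora_weight(W0, m, B, A), W0)
```

The trainable parameter count is d_in + r*(d_out + d_in) versus d_out*d_in for full fine-tuning, which is what makes this kind of adaptation lightweight enough for the cross-lingual transfer setting described here.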