Small Models, Big Impact: Efficient Corpus and Graph-Based Adaptation of Small Multilingual Language Models for Low-Resource Languages

📅 2025-02-14
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Low-resource languages (LRLs) suffer from severe data scarcity and inefficient model adaptation in NLP. Method: This paper proposes a lightweight, multi-source, parameter-efficient transfer framework. It leverages compact multilingual models (e.g., mBERT, XLM-R), integrates ≤1 GB of free-text data (GlotCC) and structured knowledge (ConceptNet), and evaluates three adapter types: Sequential Bottleneck, Invertible Bottleneck, and LoRA. Contribution/Results: The paper systematically demonstrates that Invertible Bottleneck adapters achieve slightly superior downstream performance thanks to better embedding-space alignment, while Sequential Bottleneck adapters excel in masked language modeling. The adapter-augmented small-model approach uses only 0.1%–2% of the parameters required for full fine-tuning, matching or exceeding its performance across MLM, topic classification, sentiment analysis, and NER tasks, while significantly outperforming much larger LLMs such as LLaMA-3, GPT-4, and DeepSeek-R1-based distilled models.
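The bottleneck adapters the summary refers to can be sketched as a small residual sub-layer inserted into a frozen transformer. The following is a minimal illustration, not the paper's implementation: the hidden size d=768 matches mBERT, the reduction factor of 16 is an assumed hyperparameter, and all names are hypothetical.

```python
import numpy as np

# Hypothetical Sequential Bottleneck adapter sketch (not the paper's code).
# Assumed: hidden size d=768 (mBERT) and reduction factor 16 -> bottleneck 48.
rng = np.random.default_rng(0)
d, bottleneck = 768, 48

W_down = rng.normal(0.0, 0.02, (d, bottleneck))  # trainable down-projection
W_up = np.zeros((bottleneck, d))                 # zero init: adapter starts as a no-op

def adapter(h):
    """Down-project, apply ReLU, up-project, then add the residual."""
    return h + np.maximum(h @ W_down, 0.0) @ W_up

h = rng.normal(size=(4, d))   # a batch of 4 token representations
out = adapter(h)
assert np.allclose(out, h)    # zero-initialized up-projection => identity at start

# Parameter cost of the adapter vs. one full d x d transformer weight matrix:
adapter_params = W_down.size + W_up.size
print(adapter_params, d * d, adapter_params / (d * d))  # 73728 vs 589824 (12.5%)
```

Only `W_down` and `W_up` would be trained while the backbone stays frozen, which is where the parameter savings in the summary come from; the zero-initialized up-projection ensures the adapted model initially reproduces the pretrained model exactly.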

📝 Abstract
Low-resource languages (LRLs) face significant challenges in natural language processing (NLP) due to limited data. While current state-of-the-art large language models (LLMs) still struggle with LRLs, smaller multilingual models (mLMs) such as mBERT and XLM-R offer greater promise due to a better fit of their capacity to low training data sizes. This study systematically investigates parameter-efficient adapter-based methods for adapting mLMs to LRLs, evaluating three architectures: Sequential Bottleneck, Invertible Bottleneck, and Low-Rank Adaptation. Using unstructured text from GlotCC and structured knowledge from ConceptNet, we show that small adaptation datasets (e.g., up to 1 GB of free-text or a few MB of knowledge graph data) yield gains in intrinsic (masked language modeling) and extrinsic tasks (topic classification, sentiment analysis, and named entity recognition). We find that Sequential Bottleneck adapters excel in language modeling, while Invertible Bottleneck adapters slightly outperform other methods on downstream tasks due to better embedding alignment and larger parameter counts. Adapter-based methods match or outperform full fine-tuning while using far fewer parameters, and smaller mLMs prove more effective for LRLs than massive LLMs like LLaMA-3, GPT-4, and DeepSeek-R1-based distilled models. While adaptation improves performance, pre-training data size remains the dominant factor, especially for languages with extensive pre-training coverage.
Problem

Research questions and friction points this paper is trying to address.

How to adapt small multilingual models to low-resource languages data- and parameter-efficiently.
Which parameter-efficient adapter methods perform best across intrinsic and downstream NLP tasks.
How adapter architectures compare for language modeling versus downstream performance.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapter-based parameter-efficient adaptation methods
Adaptation of small multilingual models with corpus (GlotCC) and knowledge-graph (ConceptNet) data
Sequential Bottleneck, Invertible Bottleneck, and LoRA architectures
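The third adapter type, LoRA, can likewise be sketched in a few lines: a frozen weight matrix is augmented with a trainable low-rank update. This is a generic illustration under assumed settings (rank r=8, hidden size 768), not the paper's configuration.

```python
import numpy as np

# Minimal LoRA sketch (assumed rank r=8): the pretrained weight W is frozen
# and augmented with a low-rank update B @ A; only A and B are trained.
rng = np.random.default_rng(0)
d, r = 768, 8

W = rng.normal(0.0, 0.02, (d, d))  # frozen pretrained weight
A = rng.normal(0.0, 0.02, (r, d))  # trainable down-projection
B = np.zeros((d, r))               # zero init => the update starts at zero

def lora_forward(x, scale=1.0):
    """Frozen path plus scaled low-rank correction."""
    return x @ W.T + scale * (x @ A.T) @ B.T

x = rng.normal(size=(2, d))
assert np.allclose(lora_forward(x), x @ W.T)  # identical to the base model at init

trainable = A.size + B.size
print(trainable, trainable / W.size)  # 12288 params, ~2.1% of the full matrix
```

With these assumed shapes the trainable fraction lands near the upper end of the 0.1%–2% range quoted in the summary; smaller ranks or larger hidden sizes push it lower.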