Continual-learning for Modelling Low-Resource Languages from Large Language Models

📅 2026-01-09
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses catastrophic forgetting in large language models (LLMs) that are adapted to low-resource languages. It proposes a continual learning approach that combines part-of-speech (POS)-guided code-switching with a replay buffer adapter mechanism, with the goal of improving modeling of the low-resource language while preserving knowledge of the source language. The reported experiments on language modeling and visual question answering show reduced catastrophic forgetting in multilingual settings and better results than existing baselines.

📝 Abstract
Building a language model for a multilingual setting poses several challenges, chief among them catastrophic forgetting. For example, small language models (SLMs) built for low-resource languages by adapting large language models (LLMs) are prone to catastrophic forgetting. This work proposes a continual learning strategy that combines part-of-speech (POS)-based code-switching with a replay adapter to mitigate catastrophic forgetting when training an SLM from an LLM. Experiments on vision-language tasks such as visual question answering, and on language modelling, demonstrate the effectiveness of the proposed architecture.
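
The abstract names two ingredients, POS-based code-switching and replay, without giving implementation details. The following is a minimal sketch of how such ingredients are commonly realised, not the authors' method. Everything in it is an assumption made for illustration: the function and class names, the choice to switch only nouns, the romanized Hindi toy lexicon, the pre-tagged input, and the replay ratio. The adapter part of the "replay adapter" is not modelled at all.

```python
import random
from collections import deque

# Hypothetical toy lexicon (English -> romanized Hindi). Illustrative only;
# the listing does not say which languages or lexicon the paper uses.
TOY_LEXICON = {"dog": "kutta", "water": "paani", "house": "ghar"}


def pos_code_switch(tagged_tokens, lexicon, switch_tags=("NOUN",), prob=1.0, rng=None):
    """Swap tokens whose POS tag is in `switch_tags` for their lexicon entry.

    `tagged_tokens` is a list of (word, tag) pairs produced by any POS tagger.
    Tokens without a lexicon entry are left unchanged.
    """
    rng = rng or random.Random(0)
    switched = []
    for word, tag in tagged_tokens:
        target = lexicon.get(word.lower())
        if tag in switch_tags and target is not None and rng.random() < prob:
            switched.append(target)
        else:
            switched.append(word)
    return switched


class ReplayBuffer:
    """Fixed-capacity store of earlier (source-language) examples.

    When full, the oldest example is dropped (an assumed eviction policy).
    """

    def __init__(self, capacity, seed=0):
        self.items = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, example):
        self.items.append(example)

    def sample(self, k):
        # Sample without replacement, capped at the current buffer size.
        k = min(k, len(self.items))
        return self.rng.sample(list(self.items), k)


def mixed_batch(new_examples, buffer, replay_ratio=0.5):
    """Append replayed examples to a batch of new-language examples."""
    n_replay = int(len(new_examples) * replay_ratio)
    return list(new_examples) + buffer.sample(n_replay)


if __name__ == "__main__":
    tagged = [("The", "DET"), ("dog", "NOUN"), ("drinks", "VERB"), ("water", "NOUN")]
    print(pos_code_switch(tagged, TOY_LEXICON))

    buffer = ReplayBuffer(capacity=2)
    for sentence in ["source sentence 1", "source sentence 2", "source sentence 3"]:
        buffer.add(sentence)
    print(mixed_batch(["target sentence 1", "target sentence 2"], buffer))
```

In this sketch, code-switched sentences would serve as a bridge between source and target language, while each training batch rehearses a few stored source-language examples so that earlier knowledge keeps receiving gradient signal. Which POS categories to switch and how much to replay are tuning choices that the abstract does not specify.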
Problem

Research questions and friction points this paper is trying to address.

catastrophic forgetting
low-resource languages
continual learning
large language models
small language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

continual learning
catastrophic forgetting
low-resource languages
POS-based code-switching
replay adapter
Santosh Srinath
Machine Intelligence Group, Department of CS&IS, Birla Institute of Technology and Sciences, Pilani, India
Varun Mudit Somani
Machine Intelligence Group, Department of CS&IS, Birla Institute of Technology and Sciences, Pilani, India
Prajna Reddy Padala
Machine Intelligence Group, Department of CS&IS, Birla Institute of Technology and Sciences, Pilani, India
Devi Upadhyay
Machine Intelligence Group, Department of CS&IS, Birla Institute of Technology and Sciences, Pilani, India
Abhijit Das
BITS Pilani Hyderabad, Dept of CS&IS
Computer Vision, Pattern Recognition, Machine Learning