🤖 AI Summary
Neural machine translation (NMT) faces dual challenges in continual learning: catastrophic forgetting and high computational overhead. To address these, we propose an efficient continual-adaptation framework supporting multilingual, multi-domain, and stylistic adaptation. Methodologically, we first introduce a gradient-history-aware regularization strategy that explicitly constrains the evolution path of the low-rank update matrices to mitigate forgetting. Second, we design a gate-free, linearly composable LoRA module architecture that enables interactive, user-controllable adaptation without retraining. Built on parameter-efficient fine-tuning (PEFT), our approach matches full-parameter fine-tuning across multi-task continual learning scenarios while reducing trainable parameters by over 90%, significantly improving model stability and enabling real-time domain and style switching. Our core contribution is the synergistic integration of gradient-history-aware regularization with a composable LoRA architecture, establishing a new paradigm for efficient, adaptive NMT.
📝 Abstract
Continual learning in Neural Machine Translation (NMT) faces the dual challenges of catastrophic forgetting and the high computational cost of retraining. This study establishes Low-Rank Adaptation (LoRA) as a parameter-efficient framework to address these challenges in dedicated NMT architectures. We first demonstrate that LoRA-based fine-tuning adapts NMT models to new languages and domains with performance on par with full-parameter techniques, while utilizing only a fraction of the parameter space. Second, we propose an interactive adaptation method using a calibrated linear combination of LoRA modules. This approach functions as a gate-free mixture of experts, enabling real-time, user-controllable adjustments to domain and style without retraining. Finally, to mitigate catastrophic forgetting, we introduce a novel gradient-based regularization strategy specifically designed for low-rank decomposition matrices. Unlike methods that regularize the full parameter set, our approach weights the penalty on the low-rank updates using historical gradient information. Experimental results indicate that this strategy efficiently preserves prior domain knowledge while facilitating the acquisition of new tasks, offering a scalable paradigm for interactive and continual NMT.
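For context, the low-rank update underlying the abstract's LoRA framework can be sketched in a few lines of NumPy. This is a minimal illustration of the standard LoRA formulation, not the paper's code; the dimensions, `alpha`, and the `alpha / r` scaling convention are assumptions following common practice:

```python
import numpy as np

def lora_update(W, A, B, alpha=16.0):
    """LoRA-style low-rank update: W' = W + (alpha / r) * (B @ A).

    W : (d_out, d_in) frozen base weight (never trained)
    A : (r, d_in)     trainable down-projection
    B : (d_out, r)    trainable up-projection, zero-initialized so the
                      adapted model starts out identical to the base model
    """
    r = A.shape[0]
    return W + (alpha / r) * (B @ A)

rng = np.random.default_rng(0)
d_out, d_in, r = 512, 512, 8
W = rng.standard_normal((d_out, d_in))
A = 0.01 * rng.standard_normal((r, d_in))
B = np.zeros((d_out, r))

# Zero-initialized B makes the initial update a no-op.
assert np.allclose(lora_update(W, A, B), W)

# Trainable parameters: r*(d_in + d_out) for LoRA vs. d_in*d_out for
# full fine-tuning -- the source of the "fraction of the parameter space".
lora_params = r * (d_in + d_out)
full_params = d_in * d_out
print(f"trainable fraction: {lora_params / full_params:.1%}")
```

With these example shapes the trainable fraction is about 3%, consistent with the summary's claim of reducing trainable parameters by over 90%.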
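The "calibrated linear combination of LoRA modules" can be read as summing several low-rank updates with user-chosen mixing coefficients in place of a learned gating network. A hedged sketch of that idea, assuming one adapter per domain or style (the function name, coefficient semantics, and per-adapter scaling are our illustration, not the authors' exact formulation):

```python
import numpy as np

def compose_lora(W, adapters, coeffs, alpha=16.0):
    """Gate-free linear composition of LoRA modules.

    adapters : list of (A_i, B_i) factor pairs, e.g. one per domain or style
    coeffs   : user-controlled mixing weights c_i (no learned gate), giving
               the effective weight W + sum_i c_i * (alpha / r_i) * (B_i @ A_i)
    """
    W_eff = W.copy()
    for c, (A, B) in zip(coeffs, adapters):
        W_eff += c * (alpha / A.shape[0]) * (B @ A)
    return W_eff

rng = np.random.default_rng(1)
d, r = 64, 4
W = rng.standard_normal((d, d))
legal  = (rng.standard_normal((r, d)), rng.standard_normal((d, r)))
formal = (rng.standard_normal((r, d)), rng.standard_normal((d, r)))

# Switching domain/style at inference time means changing coeffs,
# not retraining: the base model and adapters stay fixed.
W_legal   = compose_lora(W, [legal, formal], [1.0, 0.0])
W_blended = compose_lora(W, [legal, formal], [0.7, 0.3])
```

Because the combination is linear and gate-free, a user can interpolate between adapters in real time, which is the interactive adjustment the abstract describes.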
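One plausible reading of the gradient-based regularization strategy, sketched here under explicit assumptions: accumulate squared gradients of each low-rank factor on earlier tasks and use them as per-parameter importance weights that penalize drift from the old factor values (an EWC-style construction applied to the LoRA matrices, not the authors' exact penalty):

```python
import numpy as np

def grad_history_penalty(A, B, A_prev, B_prev, H_A, H_B, lam=0.1):
    """Hypothetical gradient-history-weighted penalty on low-rank factors.

    A_prev, B_prev : factor values after training on earlier tasks
    H_A, H_B       : accumulated squared gradients from those tasks,
                     acting as per-parameter importance weights
    Drift along directions with large historical gradients is penalized
    most, preserving prior-domain knowledge while leaving unimportant
    directions free to adapt to the new task.
    """
    return lam * (np.sum(H_A * (A - A_prev) ** 2)
                  + np.sum(H_B * (B - B_prev) ** 2))

rng = np.random.default_rng(2)
r, d = 4, 32
A_prev = rng.standard_normal((r, d))
B_prev = rng.standard_normal((d, r))
H_A = rng.random((r, d))
H_B = rng.random((d, r))

# No drift from the previous task's factors -> zero penalty.
assert grad_history_penalty(A_prev, B_prev, A_prev, B_prev, H_A, H_B) == 0.0
```

The key contrast with full-parameter methods such as EWC is scale: the penalty here touches only the `r*(d_in + d_out)` LoRA parameters, matching the abstract's point about regularizing the low-rank updates rather than the full parameter set.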