Replay to Remember: Retaining Domain Knowledge in Streaming Language Models

📅 2025-04-24

📈 Citations: 0

✨ Influential: 0

career value

201K/year

🤖 AI Summary

To address severe catastrophic forgetting and stringent computational/data constraints in continual domain adaptation of large language models (LLMs) under streaming settings, this paper proposes a lightweight and efficient method that synergistically optimizes minimal-sample replay—requiring only a handful of historical examples—with low-rank adaptation (LoRA). It is the first work to validate this approach under realistic streaming conditions. Comprehensive evaluation—including perplexity, semantic similarity, and GPT-assisted human assessment—demonstrates its effectiveness in significantly mitigating forgetting and partially recovering domain-specific knowledge. Experiments across three specialized domains—healthcare, genomics, and law—show that the method achieves both stability and adaptability with negligible computational overhead and storage cost. This work establishes a practical, resource-efficient paradigm for real-time, continual domain adaptation of LLMs in data- and compute-constrained environments.

Technology Category

Application Category

📝 Abstract

Continual learning in large language models (LLMs) typically encounters the critical challenge of catastrophic forgetting, where previously acquired knowledge deteriorates upon exposure to new data. While techniques like replay buffers and parameter-efficient tuning (e.g., Low-Rank Adaptation or LoRA) have been proposed, few studies investigate real-time domain adaptation under strict computational and data-stream constraints. In this paper, we demonstrate a lightweight method combining LoRA and a minimal replay mechanism in a realistic streaming setting across three diverse knowledge domains: medical question answering, genetics, and law. Using perplexity, semantic similarity, and GPT-based human-like evaluation metrics, we quantify the model's adaptation, forgetting, and recovery over time. Our experiments reveal that while catastrophic forgetting naturally occurs, even minimal replay significantly stabilizes and partially restores domain-specific knowledge. This study contributes practical insights for deploying adaptable LLMs in resource-constrained, real-world scenarios.

Problem

Research questions and friction points this paper is trying to address.

Addressing catastrophic forgetting in streaming language models

Enabling real-time domain adaptation with computational constraints

Evaluating knowledge retention across diverse domains using replay

Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines LoRA with minimal replay mechanism

Lightweight method for real-time domain adaptation

Quantifies adaptation and forgetting using multiple metrics

🔎 Similar Papers

No similar papers found.