🤖 AI Summary
This work addresses the challenge of efficiently integrating private, dynamically evolving domain expertise into large language models for high-stakes applications in biomedicine, materials science, and finance. While conventional fine-tuning is costly and prone to catastrophic forgetting, and retrieval-augmented generation (RAG) suffers from fragmented evidence and retrieval drift, the proposed Generation-Augmented Generation (GAG) framework introduces a novel paradigm by modeling private knowledge as an “expert modality.” GAG injects this knowledge into a frozen base model via representation-level alignment, eliminating the need for fine-tuning or lengthy context prompts. The approach enables plug-and-play domain specialization, seamless multi-domain composition, and reliable selective activation. Experiments demonstrate that GAG outperforms strong RAG baselines by 15.34% and 14.86% on two proprietary scientific question-answering benchmarks while preserving performance across six general-purpose benchmarks, with multi-domain selective activation approaching oracle performance.
📝 Abstract
In domains such as biomedicine, materials, and finance, high-stakes deployment of large language models (LLMs) requires injecting private, domain-specific knowledge that is proprietary, fast-evolving, and under-represented in public pretraining. However, the two dominant paradigms for private knowledge injection each have pronounced drawbacks: fine-tuning is expensive to iterate, and continual updates risk catastrophic forgetting and general-capability regression; retrieval-augmented generation (RAG) keeps the base model intact but is brittle on specialized private corpora due to chunk-induced evidence fragmentation, retrieval drift, and long-context pressure that yields query-dependent prompt inflation. Inspired by how multimodal LLMs align heterogeneous modalities into a shared semantic space, we propose Generation-Augmented Generation (GAG), which treats private expertise as an additional expert modality and injects it via a compact, representation-level interface aligned to the frozen base model, avoiding prompt-time evidence serialization while enabling plug-and-play specialization and scalable multi-domain composition with reliable selective activation. Across two private scientific QA benchmarks (immunology adjuvant and catalytic materials) and mixed-domain evaluations, GAG improves specialist performance over strong RAG baselines by 15.34% and 14.86% on the two benchmarks, respectively, while maintaining performance on six open general benchmarks and enabling near-oracle selective activation for scalable multi-domain deployment. Code is publicly available at https://github.com/360CVGroup/GAG.
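To make the representation-level idea concrete, here is a minimal sketch of what injecting an "expert modality" into a frozen base model could look like: a small trainable projector maps domain-expert embeddings into the base model's hidden space and prepends them as soft prefix tokens, so no retrieved evidence is serialized into the prompt. All names, dimensions, and the projector design are illustrative assumptions for exposition, not the actual GAG implementation (see the linked repository for that).

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 64          # frozen base model's hidden size (assumed)
EXPERT_DIM = 32      # expert-modality embedding size (assumed)
N_EXPERT_TOKENS = 4  # number of soft tokens the expert interface emits (assumed)

# The projector is the only new set of parameters; the base model stays frozen.
W_proj = rng.standard_normal((EXPERT_DIM, HIDDEN)) * 0.02

def inject_expert(expert_embs: np.ndarray, prompt_hidden: np.ndarray) -> np.ndarray:
    """Project expert embeddings into the base model's hidden space and
    prepend them to the prompt's hidden states as a fixed-size soft prefix."""
    soft_prefix = expert_embs @ W_proj                      # (N_EXPERT_TOKENS, HIDDEN)
    return np.concatenate([soft_prefix, prompt_hidden], axis=0)

# Toy usage: 4 expert embeddings, a 10-token prompt already in hidden space.
expert_embs = rng.standard_normal((N_EXPERT_TOKENS, EXPERT_DIM))
prompt_hidden = rng.standard_normal((10, HIDDEN))
augmented = inject_expert(expert_embs, prompt_hidden)
print(augmented.shape)  # (14, 64)
```

Note the contrast with RAG this sketch is meant to highlight: the context grows by a fixed, query-independent number of soft tokens rather than by the length of serialized retrieved chunks.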