🤖 AI Summary
This work addresses the high computational cost, substantial memory footprint, and inefficiency of conventional large language model (LLM) adaptation methods—such as supervised fine-tuning (SFT)—which cannot perform context-aware adaptation within a single forward pass. To overcome these limitations, the authors propose SHINE (Scalable Hyper In-context NEtwork), a framework that freezes the backbone LLM parameters and employs a context-driven hypernetwork to generate high-quality LoRA adapters in a single forward pass, effectively translating external contextual knowledge into internal model parameters. SHINE uses a two-stage training pipeline, pre-training followed by instruction tuning, which enhances both expressiveness and scalability. Experimental results show that SHINE achieves strong performance across multiple tasks while substantially reducing time, computation, and memory costs compared to SFT.
📝 Abstract
We propose SHINE (Scalable Hyper In-context NEtwork), a scalable hypernetwork that maps diverse meaningful contexts into high-quality LoRA adapters for large language models (LLMs). By reusing the frozen LLM's own parameters in an in-context hypernetwork design and introducing architectural innovations, SHINE overcomes key limitations of prior hypernetworks and achieves strong expressive power with a relatively small number of parameters. We introduce a pretraining and instruction fine-tuning pipeline, and train our hypernetwork to generate high-quality LoRA adapters from diverse meaningful contexts in a single forward pass. It updates LLM parameters without any fine-tuning and immediately enables complex question-answering tasks related to the context without directly accessing it, effectively transforming in-context knowledge into in-parameter knowledge in one pass. Our work achieves strong results on various tasks, greatly reduces time, computation, and memory costs compared to SFT-based LLM adaptation, and shows great potential for scaling. Our code is available at https://github.com/Yewei-Liu/SHINE
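The core idea, a hypernetwork that maps a context embedding to LoRA factors which are then added to frozen backbone weights, can be illustrated with a minimal sketch. This is a toy illustration under assumed shapes, not the authors' architecture: the `hypernetwork` function, its random projections, and the toy dimensions `d` and `r` are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                      # toy hidden size and LoRA rank (assumptions)
W = rng.standard_normal((d, d))  # frozen backbone weight: never updated

def hypernetwork(ctx):
    """Hypothetical stand-in for SHINE's context-driven hypernetwork:
    maps a context embedding of shape (d,) to LoRA factors B (d x r), A (r x d)."""
    # Toy mapping: fixed random projections of the context embedding.
    P_b = rng.standard_normal((d, r, d))
    P_a = rng.standard_normal((r, d, d))
    return P_b @ ctx, P_a @ ctx

ctx = rng.standard_normal(d)     # embedding of the external context
B, A = hypernetwork(ctx)         # single forward pass, no gradient-based fine-tuning

x = rng.standard_normal(d)
y_base = W @ x                   # output of the frozen model
y_adapted = (W + B @ A) @ x      # context-adapted output via the low-rank delta
```

The point of the sketch is the data flow: the context influences the model only through the generated low-rank update `B @ A`, so the backbone stays frozen and the context itself never needs to be in the prompt at inference time.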