🤖 AI Summary
To address the limited few-shot adaptation capability of large language models (LLMs) without parameter fine-tuning, this paper proposes Context Tuning—a novel prompt-based method that optimizes only task-specific prefix tokens while keeping model weights frozen. Its key innovation lies in initializing trainable prefixes with high-quality demonstration examples, thereby tightly integrating prompt learning with in-context learning (ICL) to more effectively elicit the model’s implicit knowledge. Evaluated across multiple benchmarks—including CrossFit, MMLU, and UnifiedQA—Context Tuning significantly outperforms standard ICL and hand-crafted prompts, achieving accuracy comparable to Test-Time Training. Crucially, it reduces training overhead by an order of magnitude, offering both computational efficiency and strong generalization across diverse tasks and domains.
📝 Abstract
We introduce Context Tuning, a simple and effective method to significantly enhance few-shot adaptation of large language models (LLMs) without fine-tuning model parameters. While prompt-based techniques have demonstrated the effectiveness of lightweight adaptation for LLMs, they typically initialize the trainable prompt or prefix with tokens irrelevant to the task at hand. In contrast, Context Tuning initializes the trainable prompt or prefix with task-specific demonstration examples, leveraging the model's inherent In-Context Learning (ICL) ability to extract relevant information for improved few-shot learning performance. Extensive evaluations on benchmarks such as CrossFit, UnifiedQA, MMLU, BIG-Bench Hard, and ARC demonstrate that Context Tuning outperforms traditional prompt-based adaptation methods and achieves accuracy competitive with Test-Time Training at significantly higher training efficiency.
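The core difference from standard prompt tuning is only the initialization of the trainable prefix. A minimal NumPy sketch of that idea, with all names and shapes purely illustrative (not taken from the paper's code), might look like this:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM = 100, 8
# Frozen embedding table standing in for the LLM's input embeddings.
embedding_table = rng.normal(size=(VOCAB, DIM))

def random_prompt_init(prefix_len, dim, rng):
    """Baseline prompt tuning: the trainable prefix starts as random
    vectors unrelated to the task."""
    return rng.normal(size=(prefix_len, dim))

def context_tuning_init(demo_token_ids, embedding_table):
    """Context Tuning (as described in the abstract): the trainable prefix
    starts as the embeddings of task-specific demonstration tokens, so
    optimization begins from an ICL-style context rather than noise."""
    return embedding_table[np.asarray(demo_token_ids)].copy()

# Token ids of a few-shot demonstration (hypothetical values).
demo_ids = [4, 17, 42, 7]
prefix = context_tuning_init(demo_ids, embedding_table)
# During training, only `prefix` would receive gradient updates;
# `embedding_table` and all other model weights stay frozen.
```

The prefix then occupies the same positions a hand-written ICL context would, but unlike plain ICL its vectors are subsequently optimized on the few-shot examples.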