Implicit In-context Learning

📅 2024-05-23
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
In-context learning (ICL) enables few-shot adaptation but suffers from high computational overhead and sensitivity to demonstration selection and ordering. Method: Implicit In-Context Learning (I2CL) compresses demonstration examples into a condensed context vector and injects a linear combination of that vector and the query activations back into the model's residual streams, achieving few-shot performance at zero-shot inference cost. Contribution/Results: I2CL reduces ICL inference cost to the zero-shot level with minimal information loss, and its learned injection coefficients yield a novel task-id representation that supports task similarity detection and transfer learning. Evaluated on nine real-world tasks across three model architectures, I2CL matches few-shot ICL accuracy at zero-shot inference cost while remaining robust to variations in demonstration selection and ordering.

📝 Abstract
In-context Learning (ICL) empowers large language models (LLMs) to swiftly adapt to unseen tasks at inference-time by prefixing a few demonstration examples before queries. Despite its versatility, ICL incurs substantial computational and memory overheads compared to zero-shot learning and is sensitive to the selection and order of demonstration examples. In this work, we introduce Implicit In-context Learning (I2CL), an innovative paradigm that reduces the inference cost of ICL to that of zero-shot learning with minimal information loss. I2CL operates by first generating a condensed vector representation, namely a context vector, extracted from the demonstration examples. It then conducts an inference-time intervention through injecting a linear combination of the context vector and query activations back into the model's residual streams. Empirical evaluation on nine real-world tasks across three model architectures demonstrates that I2CL achieves few-shot level performance at zero-shot inference cost, and it exhibits robustness against variations in demonstration examples. Furthermore, I2CL facilitates a novel representation of task-ids, enhancing task similarity detection and fostering effective transfer learning. We also perform a comprehensive analysis and ablation study on I2CL, offering deeper insights into its internal mechanisms. Code is available at https://github.com/LzVv123456/I2CL.
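The mechanism described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names, the mean-pooling aggregation, and the scalar coefficients `alpha` and `beta` are assumptions for clarity (the actual method learns per-layer injection parameters via calibration).

```python
import numpy as np

def extract_context_vector(demo_activations: np.ndarray) -> np.ndarray:
    """Condense residual-stream activations from several demonstration
    examples into a single context vector (mean pooling is one simple
    aggregation choice; the paper's extraction procedure differs)."""
    # demo_activations: shape (n_demos, d_model)
    return demo_activations.mean(axis=0)

def inject(query_activation: np.ndarray,
           context_vector: np.ndarray,
           alpha: float,
           beta: float) -> np.ndarray:
    """Inference-time intervention: add a linear combination of the
    context vector and the query activation back into the residual
    stream. alpha and beta stand in for learnable coefficients."""
    return alpha * context_vector + beta * query_activation

# Toy example with random activations standing in for a real model's.
rng = np.random.default_rng(0)
d_model = 8
demos = rng.normal(size=(5, d_model))   # activations from 5 demonstrations
query = rng.normal(size=d_model)        # activation for the current query

ctx = extract_context_vector(demos)
out = inject(query, ctx, alpha=0.1, beta=1.0)
```

Because the demonstrations are absorbed offline into `ctx`, the per-query cost at inference time is a single vector addition per injection site, matching the zero-shot cost claimed in the abstract.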
Problem

Research questions and friction points this paper is trying to address.

High computational and memory overhead of standard ICL
Sensitivity to the selection and ordering of demonstration examples
Limited task similarity detection and transfer learning across tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reduces ICL inference cost to the zero-shot level
Condenses demonstrations into a context vector injected into residual streams
Introduces a task-id representation for task similarity detection and transfer learning