🤖 AI Summary
Problem: In-context learning (ICL) enables few-shot adaptation but incurs high computational overhead and is sensitive to demonstration selection and ordering. Method: We propose Implicit In-Context Learning (I2CL), which condenses the demonstrations into a context vector and injects a linear combination of that vector and the query activations into the model's residual streams, giving few-shot performance at zero-shot inference cost. Contribution/Results: I2CL is presented as the first method to enable implicit ICL at zero-shot inference cost; it also yields a task-ID representation that improves task-similarity detection and supports transfer across tasks. Evaluated on nine real-world tasks and three model architectures, I2CL reaches few-shot-level accuracy while incurring the same inference cost as zero-shot baselines, and it is robust to the order and composition of the demonstrations.
📝 Abstract
In-context Learning (ICL) empowers large language models (LLMs) to swiftly adapt to unseen tasks at inference time by prefixing a few demonstration examples before queries. Despite its versatility, ICL incurs substantial computational and memory overhead compared to zero-shot learning and is sensitive to the selection and order of demonstration examples. In this work, we introduce Implicit In-context Learning (I2CL), an innovative paradigm that reduces the inference cost of ICL to that of zero-shot learning with minimal information loss. I2CL operates by first generating a condensed vector representation, namely a context vector, extracted from the demonstration examples. It then conducts an inference-time intervention by injecting a linear combination of the context vector and query activations back into the model's residual streams. Empirical evaluation on nine real-world tasks across three model architectures demonstrates that I2CL achieves few-shot-level performance at zero-shot inference cost and is robust to variations in the demonstration examples. Furthermore, I2CL facilitates a novel representation of task-ids, enhancing task similarity detection and fostering effective transfer learning. We also perform a comprehensive analysis and ablation study on I2CL, offering deeper insights into its internal mechanisms. Code is available at https://github.com/LzVv123456/I2CL.
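The two steps described in the abstract (condense demonstrations into a context vector, then inject a linear combination into the residual stream) can be illustrated with a toy sketch. This is not the authors' implementation: the mean aggregation, the function names `context_vector` and `inject`, and the scalar coefficients `lam` and `beta` are all assumptions made for illustration, and plain Python lists stand in for model activations at a single layer.

```python
from typing import List, Sequence

Vector = List[float]


def context_vector(demo_activations: Sequence[Sequence[float]]) -> Vector:
    """Condense per-demonstration activation vectors into one vector.

    Assumption: a simple element-wise mean; the abstract only says the
    context vector is "extracted from the demonstration examples".
    """
    if not demo_activations:
        raise ValueError("need at least one demonstration")
    n = len(demo_activations)
    dim = len(demo_activations[0])
    return [sum(d[i] for d in demo_activations) / n for i in range(dim)]


def inject(
    residual: Sequence[float],
    query_activation: Sequence[float],
    ctx: Sequence[float],
    lam: float,
    beta: float,
) -> Vector:
    """Add a linear combination of context vector and query activation
    to the residual stream: residual + lam * ctx + beta * query_activation.

    `lam` and `beta` are hypothetical scalar coefficients.
    """
    return [
        r + lam * c + beta * a
        for r, c, a in zip(residual, ctx, query_activation)
    ]


# Toy example with 2-dimensional "activations" from two demonstrations.
demos = [[1.0, 0.0], [3.0, 2.0]]
ctx = context_vector(demos)
updated = inject([0.5, 0.5], [1.0, -1.0], ctx, lam=0.1, beta=1.0)
```

In this sketch, setting `lam=0.0` and `beta=1.0` reduces `inject` to an ordinary residual update with no demonstration signal, which is why the query is processed at zero-shot cost: the demonstrations are only read once, when the context vector is built, and never appear in the prompt.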