🤖 AI Summary
This work investigates the emergence mechanism, functional boundaries, and theoretical nature of task vectors in in-context learning (ICL). We propose the “Linear Combination Conjecture”: task vectors are linear combinations of the original in-context demonstrations and can act as a single replacement demonstration. First, we show through loss landscape analysis that task vectors emerge naturally in linear transformers trained on triplet-formatted prompts. Then, leveraging linear Transformer modeling, saliency analysis, and parameter visualization, we derive the conjecture theoretically and support it empirically. We identify that task vectors intrinsically implement low-rank mappings, which explains their failure on high-rank tasks. Furthermore, we design a multi-task vector injection prompting strategy that improves few-shot generalization. Our work unifies the geometric interpretation, capability limits, and optimization dynamics of task vectors, advancing the understanding of controllable ICL.
📝 Abstract
Task vectors offer a compelling mechanism for accelerating inference in in-context learning (ICL) by distilling task-specific information into a single, reusable representation. Despite their empirical success, the principles governing their emergence and functionality remain unclear. This work proposes the Linear Combination Conjecture, positing that task vectors act as single in-context demonstrations formed through linear combinations of the original ones. We provide both theoretical and empirical support for this conjecture. First, through loss landscape analysis, we show that task vectors emerge naturally in linear transformers trained on triplet-formatted prompts. Next, we predict that task vectors fail to represent high-rank mappings and confirm this prediction on practical LLMs. Our findings are further validated through saliency analyses and parameter visualization, motivating an enhancement in which multiple task vectors are injected into few-shot prompts. Together, our results advance the understanding of task vectors and shed light on the mechanisms underlying ICL in transformer-based models.
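The core of the conjecture is a linearity claim: if a task vector is a linear combination of demonstration embeddings, then any linear map (e.g. the value/output projection of a linear attention layer) applied to the task vector equals the same weighted combination of that map applied to each demonstration. The following toy sketch with synthetic embeddings checks this identity; the dimension, number of demonstrations, and combination weights are all illustrative assumptions, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding dimension (illustrative)
k = 5  # number of in-context demonstrations (illustrative)

# Toy "demonstration" embeddings and hypothetical combination weights.
demos = rng.normal(size=(k, d))
weights = rng.dirichlet(np.ones(k))  # sums to 1, purely illustrative

# Linear Combination Conjecture (toy form): the task vector is a
# linear combination of the demonstration embeddings.
task_vector = weights @ demos  # shape (d,)

# Linearity check: a linear readout W applied to the task vector equals
# the weighted combination of per-demonstration readouts.
W = rng.normal(size=(d, d))
lhs = W @ task_vector            # readout of the combined vector
rhs = weights @ (demos @ W.T)    # combination of individual readouts
print(np.allclose(lhs, rhs))  # True
```

This identity is what lets a single injected vector stand in for several demonstrations under a linear model; it also hints at the rank limitation, since one vector can only carry a rank-one slice of information through such a map.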