In-Context Learning Operates as Concept Subspace Learning

📅 2026-05-12
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work investigates whether in-context learning (ICL) performs structured task inference through a low-dimensional concept subspace. The authors propose a "concept subspace" view that models ICL as regression over low-dimensional intrinsic concept coordinates, and they predict that recoverable task information concentrates in a low-dimensional, task-aligned subspace of the high-dimensional activation space. The theory analyzes ridge and least-squares ICL proxies under block-diagonal or near-block-diagonal covariance assumptions; the causal role of the subspace is then tested with residual-stream patching and concept-swap interventions. On Llama-3-8B, patching a 68–73-dimensional subspace restores 78.8% of the clean–corrupted accuracy gap, while patching the complementary subspace restores 0%. The same qualitative pattern appears on Qwen2.5-7B and on a controlled cross-lingual rule task.
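The patching intervention mentioned above can be illustrated with a minimal NumPy sketch. It is not the authors' code: the random orthonormal basis `U`, the toy activation vectors, the choice `k = 70` (a value inside the reported 68–73 range), and the helper names `patch_subspace`, `patch_complement`, and `gap_recovered` are all assumptions made for illustration. The gap metric uses a common definition, (patched − corrupted) / (clean − corrupted), which the source does not spell out.

```python
import numpy as np

def patch_subspace(h_corrupted, h_clean, U):
    """Overwrite the span(U) component of h_corrupted with that of h_clean."""
    delta = h_clean - h_corrupted
    return h_corrupted + U @ (U.T @ delta)

def patch_complement(h_corrupted, h_clean, U):
    """Overwrite only the component orthogonal to span(U)."""
    delta = h_clean - h_corrupted
    return h_corrupted + (delta - U @ (U.T @ delta))

def gap_recovered(acc_patched, acc_clean, acc_corrupted):
    """Fraction of the clean-corrupted accuracy gap restored (assumed definition)."""
    return (acc_patched - acc_corrupted) / (acc_clean - acc_corrupted)

# Toy stand-ins for residual-stream activations (hypothetical data).
rng = np.random.default_rng(0)
d, k = 4096, 70                                # k = 70 is an illustrative choice
U, _ = np.linalg.qr(rng.normal(size=(d, k)))   # orthonormal d x k basis
h_clean = rng.normal(size=d)
h_corrupted = rng.normal(size=d)

h_sub = patch_subspace(h_corrupted, h_clean, U)
h_comp = patch_complement(h_corrupted, h_clean, U)
```

In a real experiment the patched vector would be written back into the model's forward pass and accuracy re-measured; here the sketch only shows the linear algebra of swapping one subspace component while leaving the rest untouched.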
๐Ÿ“ Abstract
Regression and Bayesian accounts of in-context learning (ICL) explain how demonstrations can induce predictors, while mechanistic analyses often identify compact activation directions that steer prompted behavior. However, it remains unclear whether structured demonstrations induce low-dimensional concept inference. We study this question through a concept-subspace view of ICL, in which tasks vary only along intrinsic concept coordinates, although inputs are observed in a high-dimensional ambient space. For ridge and least-squares ICL proxies, prediction decomposes exactly into concept-coordinate regression and off-subspace leakage. Under block-diagonal or near-block-diagonal covariance assumptions, the leading estimation and nuisance-sensitivity terms scale with the dimension of the concept subspace, while residual effects are controlled by cross-subspace coupling. This separation gives a mechanistic prediction: recoverable task information should concentrate in a low-dimensional, task-aligned activation subspace. On CounterFact-derived multi-relation prompts with Llama-3-8B, a 68–73-dimensional subspace of the 4096-dimensional residual stream restores 78.8% of the clean–corrupted accuracy gap, whereas patching the complementary subspace restores 0%. Concept swaps redirect predictions toward injected relations, while random and cross-task matched-rank controls are largely ineffective. Additional experiments on Qwen2.5-7B and a controlled cross-lingual rule task show the same qualitative pattern. These results support concept subspaces as compact, task-aligned mediators of recoverable ICL behavior in structured task families, without implying full-circuit recovery.
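One way to see how a ridge prediction can split exactly into an on-subspace part and an off-subspace part is the sketch below. It is an illustration by orthogonal projection of the ridge weight vector, not necessarily the decomposition derived in the paper. The synthetic data, the dimensions, the regularizer `lam`, and the helper names `ridge_weights` and `split_prediction` are assumptions.

```python
import numpy as np

def ridge_weights(X, y, lam):
    """Closed-form ridge solution w = (X^T X + lam I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def split_prediction(X, y, x_query, U, lam=1e-2):
    """Split the ridge prediction into concept-subspace and off-subspace terms."""
    w = ridge_weights(X, y, lam)
    w_concept = U @ (U.T @ w)      # part of w inside span(U)
    w_off = w - w_concept          # part of w orthogonal to span(U)
    return x_query @ w_concept, x_query @ w_off, x_query @ w

# Synthetic task: labels depend on inputs only through k concept coordinates.
rng = np.random.default_rng(0)
d, k, n = 32, 3, 64
U, _ = np.linalg.qr(rng.normal(size=(d, k)))   # orthonormal concept basis
theta = rng.normal(size=k)                     # task vector in concept coordinates
X = rng.normal(size=(n, d))                    # demonstrations in ambient space
y = X @ U @ theta + 0.01 * rng.normal(size=n)
x_query = rng.normal(size=d)

concept_term, leakage_term, full_prediction = split_prediction(X, y, x_query, U)
```

The two terms always sum to the full prediction because the projection onto span(U) and the projection onto its orthogonal complement add up to the identity. How small the leakage term is depends on the sample covariance, which with finitely many demonstrations is only approximately block-diagonal with respect to `U`.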
Problem

Research questions and friction points this paper is trying to address.

in-context learning
concept subspace
low-dimensional inference
structured demonstrations
activation subspace
Innovation

Methods, ideas, or system contributions that make the work stand out.

concept subspace
in-context learning
activation steering
low-dimensional representation
mechanistic interpretability
🔎 Similar Papers