Mechanism of Task-oriented Information Removal in In-context Learning

📅 2025-09-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
How do language models achieve few-shot generalization via in-context learning (ICL), and what implicit information-processing mechanism underlies it? Method: The authors apply a low-rank filtering analysis to hidden states and identify “denoising heads”: attention heads that suppress redundant contextual information while preserving task-relevant representations, studied under both zero-shot and few-shot settings. They design an interpretable metric to detect these heads and run ablation studies that mask them. Contribution/Results: Masking the denoising heads significantly degrades ICL accuracy, supporting their causal role; the degradation is most pronounced when the correct label is absent from the demonstrations, indicating that the information removal mechanism, rather than label copying, drives the improvement. The work argues that ICL intrinsically involves such a denoising process, offering a mechanistic interpretation grounded in empirically validated computational evidence and providing testable hypotheses for future research.

📝 Abstract
In-context Learning (ICL) is an emerging few-shot learning paradigm based on modern Language Models (LMs), yet its inner mechanism remains unclear. In this paper, we investigate the mechanism through a novel perspective of information removal. Specifically, we demonstrate that in the zero-shot scenario, LMs encode queries into non-selective representations in hidden states that contain information for all possible tasks, leading to arbitrary outputs that do not focus on the intended task and to near-zero accuracy. Meanwhile, we find that selectively removing specific information from hidden states with a low-rank filter effectively steers LMs toward the intended task. Building on these findings, by measuring the hidden states with carefully designed metrics, we observe that few-shot ICL effectively simulates such a task-oriented information removal process, selectively removing redundant information from the entangled non-selective representations and improving the output based on the demonstrations; this constitutes a key mechanism underlying ICL. Moreover, we identify the essential attention heads inducing the removal operation, termed Denoising Heads. In ablation experiments that block these heads, and thus the information removal operation, during inference, ICL accuracy degrades significantly, especially when the correct label is absent from the few-shot demonstrations, confirming the critical role of both the information removal mechanism and the denoising heads.
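The abstract describes steering the LM by removing selected information from hidden states with a low-rank filter, but does not spell out the filter's form here. A minimal sketch of one common realization, an orthogonal projection that subtracts a low-rank subspace from residual-stream activations, might look like this (function and variable names are hypothetical, not the paper's):

```python
import numpy as np

def low_rank_filter(hidden_states, directions):
    """Remove a low-rank subspace from hidden states.

    hidden_states: (n_tokens, d_model) array of residual-stream activations.
    directions:    (d_model, k) basis spanning the subspace to remove
                   (k << d_model), e.g. directions carrying information
                   for competing, unintended tasks.
    Returns h - U U^T h, the component of h outside the subspace.
    """
    U, _ = np.linalg.qr(directions)  # orthonormalize the basis columns
    return hidden_states - hidden_states @ U @ U.T
```

Under this reading, "task-oriented information removal" amounts to projecting out the directions associated with other tasks, so the surviving component of the representation is what drives the intended behavior.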
Problem

Research questions and friction points this paper is trying to address.

Understanding how in-context learning removes task-irrelevant information
Investigating how language models filter redundant information via hidden states
Identifying denoising heads that enable selective information removal
Innovation

Methods, ideas, or system contributions that make the work stand out.

Selectively removes task-irrelevant information from hidden states
Identifies denoising attention heads enabling information removal
Simulates task-oriented information removal through few-shot demonstrations
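The ablation described above blocks the identified Denoising Heads during inference. Assuming access to per-head attention outputs before the output projection (the exact intervention point is not specified in this summary), a minimal sketch of such head masking could be:

```python
import numpy as np

def mask_heads(per_head_output, heads_to_mask):
    """Ablate candidate denoising heads by zeroing their outputs.

    per_head_output: (n_heads, seq_len, d_head) array, one slice per
                     attention head, taken before the output projection.
    heads_to_mask:   iterable of head indices whose contribution is
                     removed from the forward pass.
    """
    out = per_head_output.copy()
    out[list(heads_to_mask)] = 0.0  # the masked heads contribute nothing
    return out
```

Comparing ICL accuracy with and without this masking isolates the heads' contribution; per the abstract, accuracy drops sharply when the removal operation is blocked.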