LLMs Process Lists With General Filter Heads

📅 2025-10-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the internal mechanisms underlying list filtering in large language models (LLMs). Method: Using causal mediation analysis, the authors identify a sparse set of attention heads, termed "filter heads", that implement generic, predicate-based filtering. These heads encode transferable predicate representations, effectively abstracting the `filter` operation of functional programming. Contribution/Results: The paper provides evidence that Transformer architectures spontaneously develop human-interpretable filtering mechanisms that are agnostic to format, language, and task. Two distinct strategies are identified: explicit predicate encoding in query states, and implicit flag-bit storage in item representations. Experiments demonstrate strong generalization and functional portability of these filter heads across diverse datasets and tasks, including zero-shot transfer to unseen formats and programming languages. The findings offer empirical support for LLMs' capacity for symbolic manipulation, advancing our understanding of how neural models acquire and execute structured, compositional operations.
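The query-patching idea behind the causal mediation analysis can be illustrated with a toy, single-head attention sketch. This is not the paper's experimental code; the vectors, dimensions, and predicate names below are all hypothetical stand-ins for the learned representations the paper probes. The point is only that swapping the query state (the "predicate representation") redirects which list items the head attends to, even when the list itself is unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy hidden size

# Hypothetical item embeddings and two predicate query vectors
# (stand-ins for the representations the paper extracts).
items = rng.normal(size=(5, d))
pred_a = rng.normal(size=d)  # e.g. "is even"
pred_b = rng.normal(size=d)  # e.g. "is a fruit"

def head_weights(query, keys):
    """One attention head: softmax of query-key scores over list items."""
    scores = keys @ query
    weights = np.exp(scores - scores.max())
    return weights / weights.sum()

# Patching in a different predicate's query state changes the attention
# pattern over the same list, mirroring the paper's patching experiments.
w_a = head_weights(pred_a, items)
w_b = head_weights(pred_b, items)
assert not np.allclose(w_a, w_b)
```

The same mechanism underlies the portability result: because the predicate lives in the query alone, it can be extracted from one context and reapplied in another.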

📝 Abstract
We investigate the mechanisms underlying a range of list-processing tasks in LLMs, and we find that LLMs have learned to encode a compact, causal representation of a general filtering operation that mirrors the generic "filter" function of functional programming. Using causal mediation analysis on a diverse set of list-processing tasks, we find that a small number of attention heads, which we dub filter heads, encode a compact representation of the filtering predicate in their query states at certain tokens. We demonstrate that this predicate representation is general and portable: it can be extracted and reapplied to execute the same filtering operation on different collections, presented in different formats, languages, or even across tasks. However, we also identify situations where transformer LMs can exploit a different strategy for filtering: eagerly evaluating whether an item satisfies the predicate and storing this intermediate result as a flag directly in the item representations. Our results reveal that transformer LMs can develop human-interpretable implementations of abstract computational operations that generalize in ways that are surprisingly similar to strategies used in traditional functional programming patterns.
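The functional-programming analogy in the abstract is the generic `filter` pattern: a predicate applied uniformly to every item, independent of how the collection is presented. A minimal sketch of that pattern, with predicate portability across "formats" shown by reusing the same predicate on differently encoded items (the example values are illustrative, not from the paper):

```python
def filter_list(pred, items):
    """Generic filter: keep the items satisfying the predicate."""
    return [x for x in items if pred(x)]

is_even = lambda n: n % 2 == 0

# The predicate works on a plain list of integers...
result_ints = filter_list(is_even, [3, 8, 1, 6, 4])

# ...and the same predicate transfers to a different "format"
# (digit strings), loosely analogous to the cross-format
# portability the paper reports for filter-head query states.
result_strs = filter_list(lambda s: is_even(int(s)), ["3", "8", "1", "6"])
```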
Problem

Research questions and friction points this paper is trying to address.

Do LLMs encode a compact, causal representation of a general filtering operation?
Can a filtering predicate be extracted and ported across formats, languages, and tasks?
Do transformers implement human-interpretable analogues of functional-programming operations?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Causal mediation analysis isolates a sparse set of "filter heads" that implement generic, predicate-based filtering
Predicate representations extracted from filter-head query states transfer zero-shot to unseen formats, languages, and tasks
Identification of a second, eager strategy: predicate results stored as flag bits directly in item representations
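The second strategy described in the abstract, eager flag-bit storage, corresponds to evaluating the predicate at encoding time and stashing the result with each item, so that later "filtering" only reads the stored flags. A toy sketch of the distinction (values and names are illustrative):

```python
items = [3, 8, 1, 6]

# Lazy strategy: the predicate is applied at filter time.
lazy_result = [x for x in items if x % 2 == 0]

# Eager strategy: the predicate is evaluated up front and stored
# as a flag alongside each item (the "flag bit" in the paper's
# terms); filtering later just reads the flags.
tagged = [(x, x % 2 == 0) for x in items]
eager_result = [x for x, flag in tagged if flag]

assert lazy_result == eager_result
```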