The Truth Lies Somewhere in the Middle (of the Generated Tokens)

📅 2026-05-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work investigates how to aggregate the hidden states produced during autoregressive generation into a single, semantically informative representation. Because semantic information appears to be distributed across generated tokens rather than localized at one position, the authors mean-pool the hidden states of the generated tokens and measure representation quality by kernel alignment to reference spaces in the language, vision, and protein domains. Mean-pooled representations align better with these reference spaces than any individual token does, representations derived from generated tokens outperform those derived from prompt tokens, and tracking alignment across generation reveals interpretable dynamics in model behavior.
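
As a rough illustration of the pooling step, the sketch below collapses one sequence's per-token hidden states into three candidate representations: the mean over generated tokens, the last generated token, and the mean over prompt tokens. The function name `pool_generated`, the `(seq_len, d)` array layout, and the `prompt_len` split are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def pool_generated(hidden_states: np.ndarray, prompt_len: int) -> dict:
    """Collapse per-token hidden states into candidate representations.

    hidden_states: array of shape (seq_len, d) from a single layer, with the
    prompt tokens first and the generated tokens after them (assumed layout).
    prompt_len: number of prompt tokens at the start of the sequence.
    """
    seq_len = hidden_states.shape[0]
    if not 0 < prompt_len < seq_len:
        raise ValueError("need at least one prompt and one generated token")

    prompt = hidden_states[:prompt_len]
    generated = hidden_states[prompt_len:]

    return {
        # Mean pooling across all generated positions.
        "mean_generated": generated.mean(axis=0),
        # Single-token baseline: the final generated position.
        "last_token": generated[-1],
        # Prompt-side baseline for comparison.
        "mean_prompt": prompt.mean(axis=0),
    }


if __name__ == "__main__":
    # Toy hidden states: 6 tokens (2 prompt + 4 generated), dimension 2.
    toy = np.arange(12, dtype=float).reshape(6, 2)
    for name, vec in pool_generated(toy, prompt_len=2).items():
        print(name, vec)
```

In practice the hidden states would come from a chosen layer of the model, and each candidate would be computed for every example in an evaluation set before comparing them.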

📝 Abstract

How should hidden states generated autoregressively be collapsed into a representation that reflects a language model's internal state? Despite tokens being generated under causal masking, we find that mean pooling across their hidden states yields more semantic representations than any individual token alone. We quantify this through kernel alignment to reference spaces in language, vision, and protein domains. The improvement through mean pooling is consistent with information being distributed across generated tokens rather than localized to a single position. Furthermore, representations derived from generated tokens outperform those from prompt tokens, and alignment across generation reveals interpretable dynamics in model behavior.
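
The abstract does not name the exact kernel alignment measure, so the sketch below substitutes linear centered kernel alignment (CKA) as one common choice. It scores how well a set of model representations matches a reference embedding of the same examples. The function name `linear_cka` and the toy data are assumptions for illustration only.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two representations of the same n examples.

    x: (n, d1) model representations, e.g. mean-pooled hidden states.
    y: (n, d2) reference representations of the same examples.
    Returns a score in [0, 1]; higher means the two spaces agree more.
    """
    if x.shape[0] != y.shape[0]:
        raise ValueError("x and y must describe the same examples")

    # Center each feature so the implied Gram matrices are centered.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)

    # ||X^T Y||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(x.T @ y, "fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, "fro")
    norm_y = np.linalg.norm(y.T @ y, "fro")
    return float(cross / (norm_x * norm_y))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.normal(size=(50, 8))       # stand-in reference space
    noisy_copy = reference + 0.1 * rng.normal(size=(50, 8))
    unrelated = rng.normal(size=(50, 8))
    print("noisy copy:", linear_cka(noisy_copy, reference))
    print("unrelated: ", linear_cka(unrelated, reference))
```

Under this measure, comparing pooling strategies amounts to computing the score once per strategy against the same reference space and checking which is highest.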

Problem

Research questions and friction points this paper is trying to address.

hidden states

representation

autoregressive generation

mean pooling

language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

mean pooling

hidden states

autoregressive generation