AI Summary
This study investigates the extent of original input information preserved in the final-layer token representation of large language models (LLMs). To quantify this, we propose Rep2Text, a lightweight, trainable adapter that maps a single output token into the embedding space of a decoder LLM and reconstructs the input text autoregressively. To our knowledge, this is the first method enabling efficient recovery of multi-token inputs (e.g., 16-token sequences) from a single terminal token, revealing substantial yet underutilized internal redundancy in LLMs and challenging conventional information bottleneck assumptions. Rep2Text supports cross-architecture composition (e.g., using Llama's representations to drive OPT decoding) and maintains high semantic fidelity and textual coherence on both in-distribution and out-of-distribution medical texts. Empirical results show an average input information recovery rate exceeding 50%.
Abstract
Large language models (LLMs) have achieved remarkable progress across diverse tasks, yet their internal mechanisms remain largely opaque. In this work, we address a fundamental question: to what extent can the original input text be recovered from a single last-token representation within an LLM? We propose Rep2Text, a novel framework for decoding full text from last-token representations. Rep2Text employs a trainable adapter that projects a target model's internal representations into the embedding space of a decoding language model, which then autoregressively reconstructs the input text. Experiments on various model combinations (Llama-3.1-8B, Gemma-7B, Mistral-7B-v0.1, Llama-3.2-3B) demonstrate that, on average, over half of the information in 16-token sequences can be recovered from this compressed representation while maintaining strong semantic integrity and coherence. Furthermore, our analysis reveals an information bottleneck effect: longer sequences exhibit decreased token-level recovery while still preserving strong semantic integrity. Finally, our framework demonstrates robust generalization to out-of-distribution medical data.
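The adapter idea described above can be sketched in a few lines: a trainable map takes one last-token representation from the target model and produces a short prefix of embeddings in the decoder model's space, which then conditions autoregressive reconstruction. The sketch below uses numpy and a plain linear map; the dimensions, the prefix length `k_prefix`, and the single-layer form are illustrative assumptions, not the paper's actual architecture or sizes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not the paper's actual sizes).
d_target = 4096   # hidden size of the target model (e.g. Llama-3.1-8B)
d_dec = 2048      # embedding size of the decoding model
k_prefix = 4      # number of decoder-space embeddings the adapter emits

# Trainable adapter parameters: a single linear map from one last-token
# representation to k_prefix decoder embeddings (a minimal stand-in; the
# actual Rep2Text adapter may be deeper).
W = rng.normal(0.0, 0.02, size=(d_target, k_prefix * d_dec))
b = np.zeros(k_prefix * d_dec)

def adapt(last_token_rep: np.ndarray) -> np.ndarray:
    """Project a single last-token representation into a soft prefix of
    decoder embeddings that would condition autoregressive decoding."""
    prefix = last_token_rep @ W + b
    return prefix.reshape(k_prefix, d_dec)

# A random stand-in for the target model's final-layer last-token state.
h_last = rng.normal(size=(d_target,))
soft_prefix = adapt(h_last)
print(soft_prefix.shape)  # (4, 2048)
```

In a real setup the resulting `soft_prefix` would be fed to the decoder LLM in place of token embeddings (e.g. via an `inputs_embeds`-style interface), and the adapter weights would be trained so that the decoder reconstructs the original input text token by token.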