Vector Arithmetic in Concept and Token Subspaces

📅 2025-11-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates how semantic and surface-level information are separated within the hidden states of large language models (LLMs). Focusing on Llama-2-7b, the authors build on previously identified concept induction heads and token induction heads, which respectively copy high-level word meanings and literal token representations, to construct semantic and token subspaces from the model's internal activations. Transforming hidden states with the attention weights of these heads yields subspace projections that support disentangled vector arithmetic directly on hidden states: semantic analogies (e.g., "Athens" − "Greece" + "China" = "Beijing") reach 80% nearest-neighbor accuracy in the concept subspace, well above the 47% attained on raw hidden states, while the token subspace recovers lexical and morphological properties. These results point to an intrinsic geometric separability in LLM representations, with implications for interpretable modeling and controllable generation.

📝 Abstract
In order to predict the next token, LLMs must represent semantic and surface-level information about the current word. Previous work identified two types of attention heads that disentangle this information: (i) Concept induction heads, which copy word meanings, and (ii) Token induction heads, which copy literal token representations (Feucht et al., 2025). We show that these heads can be used to identify subspaces of model activations that exhibit coherent semantic structure in Llama-2-7b. Specifically, when we transform hidden states using the attention weights of concept heads, we are able to more accurately perform parallelogram arithmetic (Mikolov et al., 2013) on the resulting hidden states, e.g., showing that "Athens" - "Greece" + "China" = "Beijing". This transformation allows for much higher nearest-neighbor accuracy (80%) than direct use of raw hidden states (47%). Analogously, we show that token heads allow for transformations that reveal surface-level word information in hidden states, allowing for operations like "coding" - "code" + "dance" = "dancing".
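The parallelogram arithmetic described above can be sketched with a toy nearest-neighbor lookup. This is a minimal illustration of the Mikolov-style analogy test, not the paper's method: the tiny hand-built vectors below stand in for the (transformed) hidden states, and `analogy` simply returns the word closest to a − b + c by cosine similarity, excluding the query words.

```python
import numpy as np

def analogy(emb: dict, a: str, b: str, c: str) -> str:
    """Parallelogram arithmetic: find the word whose vector is nearest
    (by cosine similarity) to emb[a] - emb[b] + emb[c], excluding a, b, c."""
    query = emb[a] - emb[b] + emb[c]
    query = query / np.linalg.norm(query)
    best, best_sim = None, -np.inf
    for word, vec in emb.items():
        if word in (a, b, c):
            continue  # standard convention: the query words are excluded
        sim = float(vec @ query / np.linalg.norm(vec))
        if sim > best_sim:
            best, best_sim = word, sim
    return best

# Toy vectors constructed so the analogy holds exactly (illustrative only;
# the paper uses transformed Llama-2-7b hidden states instead).
emb = {
    "Athens":  np.array([1.0, 1.0, 0.0]),
    "Greece":  np.array([1.0, 0.0, 0.0]),
    "China":   np.array([0.0, 0.0, 1.0]),
    "Beijing": np.array([0.0, 1.0, 1.0]),
}
print(analogy(emb, "Athens", "Greece", "China"))  # -> Beijing
```

The reported accuracies (80% vs. 47%) come from running exactly this kind of nearest-neighbor test over analogy sets, with the embedding dictionary swapped for hidden states either raw or projected into the concept subspace.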
Problem

Research questions and friction points this paper is trying to address.

Identifying semantic subspaces in LLMs using concept induction heads
Enhancing word analogy accuracy through transformed hidden states
Revealing surface-level token relationships via token induction heads
Innovation

Methods, ideas, or system contributions that make the work stand out.

Concept heads identify semantic subspaces for analogy tasks
Token heads reveal surface-level transformations in hidden states
Attention weights enable vector arithmetic with high accuracy
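The transformation the bullets refer to — mapping a hidden state through a head's attention weights — can be sketched as passing the residual-stream vector through that head's value and output projections (its OV circuit). Everything below is a hypothetical toy setup: the matrices are random stand-ins, and the real dimensions and weights would come from an identified concept or token head in Llama-2-7b.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_head = 16, 4  # toy sizes; Llama-2-7b uses d_model=4096, d_head=128

# Hypothetical per-head weights; in practice these would be read from a
# concept induction head found via the procedure of Feucht et al. (2025).
W_V = rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
W_O = rng.standard_normal((d_head, d_model)) / np.sqrt(d_head)

def project_through_head(h: np.ndarray) -> np.ndarray:
    """Map a hidden state through one head's OV circuit: h @ W_V @ W_O.
    The image is a rank-at-most-d_head subspace of the residual stream,
    playing the role of the 'concept subspace' in this sketch."""
    return h @ W_V @ W_O

h = rng.standard_normal(d_model)   # stand-in for a hidden state
p = project_through_head(h)        # its projection into the head's subspace
```

Because W_V @ W_O has rank at most d_head, the projected vectors live in a low-dimensional subspace, which is why vector arithmetic can behave more cleanly there than on raw d_model-dimensional hidden states.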