🤖 AI Summary
This work investigates whether large language models (LLMs) possess genuine compositional mechanisms for two-hop factual recall—i.e., computing $y = g(f(x))$—and probes the origins of the “compositionality gap”: the phenomenon where LLMs successfully solve individual subproblems $z = f(x)$ and $y = g(z)$, yet fail on their composition $g(f(x))$. Using residual stream activation analysis, logit lens probing to track intermediate representations, and geometric quantification of embedding-space structure—particularly mapping linearity—the authors identify two parallel processing mechanisms: compositional (explicit stepwise computation) and shortcut (end-to-end direct mapping). Crucially, they demonstrate that geometric properties—especially embedding alignment—strongly bias mechanism selection. The study empirically confirms the existence of the compositionality gap, uncovers its mechanistic underpinnings in residual stream dynamics and embedding geometry, and provides a unified analytical framework. All experimental data and code are publicly released.
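The logit-lens probing mentioned above reads token predictions directly off intermediate residual-stream activations by projecting them through the model's unembedding matrix. A minimal sketch of the idea follows; the shapes, random values, and `logit_lens` helper are hypothetical stand-ins for illustration, not the authors' actual model or code.

```python
import numpy as np

# Hypothetical dimensions: residual stream width d_model, vocabulary size V.
d_model, vocab_size = 8, 20
rng = np.random.default_rng(0)

# Toy stand-ins for a transformer's unembedding matrix and the residual
# stream activation at the final token position after each of 4 layers.
W_U = rng.normal(size=(d_model, vocab_size))
resid_per_layer = rng.normal(size=(4, d_model))

def logit_lens(resid, W_U):
    """Project an intermediate residual-stream vector through the
    unembedding matrix to read off token logits before the final layer."""
    return resid @ W_U

# Tracking the top token per layer is how intermediate variables like
# f(x) can surface mid-computation in the compositional mechanism.
for layer, resid in enumerate(resid_per_layer):
    top_token = int(np.argmax(logit_lens(resid, W_U)))
    print(f"layer {layer}: top token id = {top_token}")
```

In practice the same projection is applied (after any final layer norm) to real model activations; if the token for $z = f(x)$ ranks highly at some middle layer, that is the "detectable signature" of stepwise computation.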
📝 Abstract
While large language models (LLMs) appear to be increasingly capable of solving compositional tasks, it is an open question whether they do so using compositional mechanisms. In this work, we investigate how feedforward LLMs solve two-hop factual recall tasks, which can be expressed compositionally as $g(f(x))$. We first confirm that modern LLMs continue to suffer from the "compositionality gap": i.e., their ability to compute both $z = f(x)$ and $y = g(z)$ does not entail their ability to compute the composition $y = g(f(x))$. Then, using logit lens on their residual stream activations, we identify two processing mechanisms, one which solves tasks *compositionally*, computing $f(x)$ along the way to computing $g(f(x))$, and one which solves them *directly* (idiomatically), without any detectable signature of the intermediate variable $f(x)$. Finally, we find that which mechanism is employed appears to be related to the embedding space geometry, with the idiomatic mechanism being dominant in cases where there exists a linear mapping from $x$ to $g(f(x))$ in the embedding spaces. We fully release our data and code at: https://github.com/apoorvkh/composing-functions.
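The geometric criterion above, whether a linear mapping from the embedding of $x$ to the embedding of $g(f(x))$ exists, can be quantified by fitting a least-squares map between the two sets of embeddings and measuring how much variance it explains. A minimal sketch, using synthetic embeddings as a stand-in for the real ones (the dimensions, noise level, and variable names are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 100, 16  # number of (x, g(f(x))) pairs, embedding dimension

# Hypothetical embeddings: X holds embeddings of inputs x,
# Y holds embeddings of outputs g(f(x)). Here we synthesize a
# mostly-linear relation plus noise purely to exercise the fit.
X = rng.normal(size=(n, d))
A_true = rng.normal(size=(d, d))
Y = X @ A_true + 0.1 * rng.normal(size=(n, d))

# Fit the best linear map A_hat via least squares: Y ≈ X @ A_hat.
A_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Quantify mapping linearity as the fraction of variance explained (R^2);
# values near 1 indicate a near-linear x -> g(f(x)) mapping.
resid = Y - X @ A_hat
r2 = 1.0 - (resid**2).sum() / ((Y - Y.mean(axis=0))**2).sum()
print(f"mapping linearity (R^2) = {r2:.3f}")
```

A high $R^2$ on real embeddings would correspond to the regime where the idiomatic (direct) mechanism is reported to dominate; a low score suggests no single linear map captures the composition.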