Belief or Circuitry? Causal Evidence for In-Context Graph Learning

📅 2026-05-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates whether large language models rely on local pattern matching or infer latent graph structure during in-context learning. Using random-walk tasks over two competing graph topologies and three analysis tools (PCA-based representation analysis, residual-stream activation patching, and graph-difference linear steering), the work finds that models encode global topology and local transition information at the same time. At intermediate mixture ratios, the two graph structures occupy orthogonal principal subspaces; late-layer patching almost fully transfers the clean graph preference; and linear steering shifts predictions in the intended direction, whereas norm-matched and label-shuffled control vectors do not. These results support a dual-mechanism account of in-context graph learning, in which structure inference and induction circuits operate in parallel.
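
To make the task setup concrete, the following is a minimal sketch of how a random walk over two competing graphs might be generated. It is not the authors' construction: the choice of a 16-node ring versus a 4x4 grid, the per-step mixing rule, and the names `ring_graph`, `grid_graph`, and `mixed_walk` are all assumptions made for illustration.

```python
import random

def ring_graph(n):
    """Adjacency list for a ring: node i connects to i-1 and i+1 (mod n)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

def grid_graph(rows, cols):
    """Adjacency list for a 4-neighbour grid with node id = row * cols + col."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            nbrs = []
            if r > 0:
                nbrs.append((r - 1) * cols + c)
            if r < rows - 1:
                nbrs.append((r + 1) * cols + c)
            if c > 0:
                nbrs.append(r * cols + c - 1)
            if c < cols - 1:
                nbrs.append(r * cols + c + 1)
            adj[r * cols + c] = nbrs
    return adj

def mixed_walk(graph_a, graph_b, mix, length, seed=0):
    """Random walk over a shared node set.

    ASSUMPTION: at each step the next node is drawn from graph_a with
    probability `mix`, otherwise from graph_b. The paper's actual mixing
    scheme may differ.
    Returns the walk and the graph ("A" or "B") used for each transition.
    """
    rng = random.Random(seed)
    node = rng.choice(sorted(graph_a))
    walk, sources = [node], []
    for _ in range(length - 1):
        use_a = rng.random() < mix
        node = rng.choice((graph_a if use_a else graph_b)[node])
        walk.append(node)
        sources.append("A" if use_a else "B")
    return walk, sources

# Two hypothetical competing topologies over the same 16 node ids.
ring = ring_graph(16)
grid = grid_graph(4, 4)

walk, sources = mixed_walk(ring, grid, mix=0.5, length=20, seed=0)
print(walk)
print(sources)
```

With `mix=1.0` every transition is a ring edge and with `mix=0.0` every transition is a grid edge; intermediate values interleave the two, so a context can carry evidence for both structures.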

📝 Abstract

How do LLMs learn in-context? Is it by pattern-matching recent tokens, or by inferring latent structure? We probe this question using a toy random-walk task over two competing graph structures. This task's answer is, in principle, decidable: either the model tracks global topology, or it copies local transitions. We present two lines of evidence that neither account alone is sufficient. First, reconstructing the internal representation structure via PCA reveals that at intermediate mixture ratios, both graph topologies are encoded simultaneously in orthogonal principal subspaces. This pattern is difficult to reconcile with purely local transition copying. Second, residual-stream activation patching and graph-difference steering causally intervene on this graph-family signal: late-layer patching almost fully transfers the clean graph preference, while linear steering moves predictions in the intended direction, an effect that disappears under norm-matched and label-shuffled controls. Taken together, our findings are most consistent with a dual-mechanism account in which genuine structure inference and induction circuits operate in parallel.
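
The steering comparison described in the abstract can be illustrated with a small sketch. It assumes that "graph-difference steering" means a difference-of-means vector between activations collected under the two graph families, which the abstract does not spell out. The activations here are synthetic Gaussian data rather than real residual-stream values, and `graph_axis`, `preference_shift`, and the other names are hypothetical.

```python
import numpy as np

def difference_of_means(acts, labels):
    """Candidate steering vector: mean activation for graph A (label 0)
    minus mean activation for graph B (label 1)."""
    acts = np.asarray(acts, dtype=float)
    labels = np.asarray(labels)
    return acts[labels == 0].mean(axis=0) - acts[labels == 1].mean(axis=0)

def norm_matched_random(vector, rng):
    """Control: a random direction rescaled to the steering vector's norm."""
    noise = rng.standard_normal(vector.shape)
    return noise * (np.linalg.norm(vector) / np.linalg.norm(noise))

def label_shuffled(acts, labels, rng):
    """Control: the same recipe applied after permuting the graph labels."""
    return difference_of_means(acts, rng.permutation(labels))

# Synthetic stand-in for residual-stream activations (NOT real model data).
rng = np.random.default_rng(0)
d_model, n_per_class = 64, 200
graph_axis = np.zeros(d_model)
graph_axis[0] = 1.0  # hypothetical direction that encodes graph family
labels = np.repeat([0, 1], n_per_class)
signs = np.where(labels == 0, 1.0, -1.0)
acts = rng.standard_normal((2 * n_per_class, d_model))
acts += 2.0 * signs[:, None] * graph_axis

steer = difference_of_means(acts, labels)
random_ctrl = norm_matched_random(steer, rng)
shuffled_ctrl = label_shuffled(acts, labels, rng)

def preference_shift(vector):
    """Toy readout: how far adding `vector` moves an activation along the
    hypothetical graph-family direction."""
    return float(vector @ graph_axis)

for name, vec in [("steer", steer), ("norm-matched", random_ctrl),
                  ("label-shuffled", shuffled_ctrl)]:
    print(f"{name:>14}: shift = {preference_shift(vec):+.2f}")
```

In this toy setting the difference-of-means vector moves the readout strongly, while both controls move it much less. That is the kind of specificity the controls are meant to test: the effect should depend on the direction and on the labels, not merely on the size of the perturbation.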

Problem

Research questions and friction points this paper is trying to address.

in-context learning

graph structure

latent inference

large language models

mechanistic interpretability

Innovation

Methods, ideas, or system contributions that make the work stand out.

in-context learning

causal intervention

graph structure inference