🤖 AI Summary
Existing interpretability studies of decoder-only Transformers lack a unified framework for understanding graph reasoning—particularly path reasoning and substructure extraction. Method: We propose two core mechanisms—“token merging” and “structural memorization”—and develop a circuit-tracing–based analytical framework that combines visualization of reasoning trajectories with quantitative attribution to systematically uncover how such models encode and manipulate graph-structured information. Contribution/Results: Our experiments quantify the impact of graph density and model scale on reasoning behavior and provide, for the first time, a unified mechanistic explanation for these two canonical graph reasoning tasks. We identify and quantitatively localize the attention and feed-forward network (FFN) components critical for graph reasoning. Moreover, we establish the first interpretability paradigm specifically tailored to decoder-only architectures for graph reasoning, bridging a key gap in neural-symbolic reasoning analysis.
📝 Abstract
Transformer-based LLMs demonstrate strong performance on graph reasoning tasks, yet their internal mechanisms remain underexplored. To uncover these reasoning mechanisms in a fundamental and unified way, we study basic decoder-only Transformers and analyze them with the circuit-tracer framework. Through this lens, we visualize reasoning traces and identify two core mechanisms in graph reasoning: token merging and structural memorization, which underlie both path reasoning and substructure extraction tasks. We further quantify these behaviors and analyze how they are influenced by graph density and model size. Our study provides a unified interpretability framework for understanding structural reasoning in decoder-only Transformers.
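For readers unfamiliar with the task setup, a minimal sketch of what "path reasoning" asks may help. This is an illustration of the task itself, not of the paper's models or circuit analysis: given a directed graph (here a toy 4-node adjacency matrix of our own choosing), decide whether one node is reachable from another. Classically this follows from powers of the adjacency matrix, which is exactly the structural information a model must encode to answer such queries.

```python
import numpy as np

# Toy directed graph (illustrative, not from the paper): edges 0->1, 1->2, 2->3.
A = np.array([
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
])

def reachable(adj, src, dst):
    """Return True if dst is reachable from src in at most n-1 hops.

    (A + I)^(n-1) has a positive (src, dst) entry iff some path of
    length <= n-1 connects src to dst.
    """
    n = len(adj)
    reach = np.linalg.matrix_power(adj + np.eye(n, dtype=int), n - 1)
    return bool(reach[src, dst] > 0)

print(reachable(A, 0, 3))  # True: path 0 -> 1 -> 2 -> 3
print(reachable(A, 3, 0))  # False: no edge leaves node 3
```

A decoder-only model solving this task from a serialized edge list must implicitly recover the same reachability relation, which is what the token-merging and structural-memorization mechanisms above are meant to explain.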