SLASH the Sink: Sharpening Structural Attention Inside LLMs

📅 2026-05-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited structural understanding that large language models (LLMs) show when processing serialized graph data, which hinders their graph reasoning. The study finds that LLMs spontaneously reconstruct graph topology internally, visible as a distinctive "sawtooth" pattern in attention maps that aligns with the token-level adjacency matrix; however, this signal is diluted by the attention sink. To mitigate this, the authors propose StructuraL Attention SHarpening (Slash), a training-free, plug-and-play attention redistribution method that amplifies the model's intrinsic structural awareness. Theoretical analysis formalizes the dilution as a representation bottleneck caused by a fundamental conflict: the anisotropic bias that language tasks depend on suppresses the topology-aware local aggregation that graph reasoning requires. Experiments on pure graph tasks and molecular property prediction show significant and consistent performance gains across diverse LLMs.
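To make the term "token-level adjacency matrix" concrete, the following is a minimal sketch under stated assumptions, not the paper's definition or code. It assumes each token of the serialized graph is tagged with the node it mentions (or `None`), and it marks a token pair as adjacent when the two nodes share an edge, restricted to earlier tokens to mimic causal attention. The names `token_adjacency` and `structural_share` and the toy serialization are hypothetical.

```python
def token_adjacency(token_nodes, edges):
    """Build a causal token-level adjacency matrix (assumed definition).

    token_nodes[i] is the node id mentioned by token i, or None.
    Entry [i][j] is 1 when j < i and the two nodes share an edge.
    """
    linked = set()
    for u, v in edges:
        linked.add((u, v))
        linked.add((v, u))  # treat the graph as undirected
    n = len(token_nodes)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):  # only earlier tokens are visible
            a, b = token_nodes[i], token_nodes[j]
            if a is not None and b is not None and (a, b) in linked:
                adj[i][j] = 1
    return adj


def structural_share(attn_row, adj_row):
    """Share of one query token's attention that lands on adjacent tokens."""
    return sum(w * m for w, m in zip(attn_row, adj_row))


# Toy serialization: "<bos> A - B , B - C"
token_nodes = [None, "A", None, "B", None, "B", None, "C"]
edges = [("A", "B"), ("B", "C")]
adj = token_adjacency(token_nodes, edges)

# A made-up attention row for the final "C" token, dominated by position 0.
row = [0.6, 0.0, 0.0, 0.2, 0.0, 0.2, 0.0, 0.0]
print(structural_share(row, adj[7]))
```

In this toy row most of the attention sits on the first token, which is the kind of dilution the summary attributes to the attention sink; the attention values here are illustrative, not measured.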

📝 Abstract

Large Language Models (LLMs) show remarkable semantic understanding but often struggle with structural understanding when processing graph topologies in a serialized format. Existing solutions rely on training external graph-based adapters or fine-tuning, which incur high costs and reduce generalizability. In this work, we investigate the internal mechanisms of LLMs and present a critical finding: LLMs spontaneously reconstruct the graph's topology internally, evidenced by a distinct "sawtooth" pattern in their attention maps that structurally aligns with the "token-level adjacency matrix". However, this intrinsic structural understanding is diluted by the attention sink. We theoretically formalize this dilution as a representation bottleneck, stemming from a fundamental conflict: the model's anisotropic bias, essential for language tasks, suppresses the topology-aware local aggregation required for graph reasoning. To address this, we propose a training-free solution, named StructuraL Attention SHarpening (Slash), which amplifies this internal structural understanding via a plug-and-play attention redistribution. Experiments on pure graph tasks and molecular prediction validate that Slash delivers significant and consistent performance gains across diverse LLMs.
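The abstract does not state the Slash update rule, so the following is only an illustrative sketch of what an attention redistribution could look like, not the authors' method. It assumes the sink is the first token, that a 0/1 mask of structurally adjacent tokens is available for the query token, and that a hypothetical strength parameter `alpha` controls how much sink mass is moved. The function name `sharpen_row` is also hypothetical.

```python
def sharpen_row(row, mask, sink=0, alpha=0.5):
    """Move a fraction of sink attention onto structurally adjacent tokens.

    row   : attention weights of one query token (sums to 1)
    mask  : 0/1 flags marking structurally adjacent key tokens (assumed given)
    sink  : index of the sink token (assumed to be position 0)
    alpha : fraction of the sink's mass to redistribute (hypothetical knob)
    """
    targets = [j for j, m in enumerate(mask) if m and j != sink]
    if not targets:
        return list(row)  # nothing to sharpen toward
    moved = alpha * row[sink]
    base = sum(row[j] for j in targets)
    out = list(row)
    out[sink] -= moved
    for j in targets:
        # Split the moved mass in proportion to existing weights,
        # falling back to a uniform split when those weights are all zero.
        share = row[j] / base if base > 0 else 1.0 / len(targets)
        out[j] += moved * share
    return out


row = [0.8, 0.05, 0.05, 0.1]   # made-up weights dominated by the sink
mask = [0, 1, 0, 1]            # tokens 1 and 3 are structurally adjacent
print(sharpen_row(row, mask))
```

Because mass is only moved, never created, the row still sums to 1 and no renormalization is needed. A proportional split keeps the relative ordering the model already assigned to adjacent tokens; other choices are equally plausible under the abstract's description.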

Problem

Research questions and friction points this paper is trying to address.

structural understanding

graph topology

attention sink

large language models

representation bottleneck

Innovation

Methods, ideas, or system contributions that make the work stand out.

structural attention

attention sink

graph reasoning