Graph-KV: Breaking Sequence via Injecting Structural Biases into Large Language Models

📅 2025-06-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current LLMs, constrained by autoregressive sequential modeling, struggle to capture structural dependencies among text segments, such as graph relationships, which limits their effectiveness in RAG and graph-structured reasoning tasks. To address this, the authors propose Graph-KV, a structure-aware KV caching mechanism built on a novel graph-structured block attention mask. It decouples positional encoding from topological dependency: each "target" segment attends exclusively to the KV representations of its designated "source" segments, yielding graph-guided sparse attention and message-passing-style context aggregation. The method supports end-to-end training without modifying the model backbone. Evaluated on seven RAG benchmarks, Arxiv-QA (a graph-based QA task), and citation-network classification, Graph-KV consistently outperforms sequential baselines, significantly mitigating positional bias while improving long-range and multi-hop reasoning.
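The graph-structured block mask described in the summary can be sketched concretely. The snippet below is a minimal illustration, not the paper's implementation: it assumes segments are concatenated in order, applies causal attention within each segment, and opens cross-segment attention only from a target segment to the KV positions of its source segments. The function name and edge representation are hypothetical.

```python
import numpy as np

def graph_block_mask(seg_lens, edges):
    """Build a token-level attention mask from segment-level edges.

    seg_lens: token length of each text segment, in concatenation order.
    edges: (source, target) segment-index pairs; a target segment may
    attend to the KV entries of its source segments (plus itself, causally).
    Returns a boolean matrix where mask[i, j] == True means query token i
    may attend to key token j.
    """
    offsets = np.concatenate([[0], np.cumsum(seg_lens)])
    n = int(offsets[-1])
    mask = np.zeros((n, n), dtype=bool)
    # Standard causal attention within each segment.
    for s, ln in enumerate(seg_lens):
        a, b = offsets[s], offsets[s + 1]
        mask[a:b, a:b] = np.tril(np.ones((ln, ln), dtype=bool))
    # Cross-segment attention only along graph edges (target -> source KV),
    # never to unrelated preceding segments.
    for src, tgt in edges:
        mask[offsets[tgt]:offsets[tgt + 1], offsets[src]:offsets[src + 1]] = True
    return mask
```

With two 2-token segments and a single edge (0, 1), tokens of segment 1 can read segment 0's KV cache, but segment 0 never sees segment 1: the mask is block-sparse rather than fully lower-triangular.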

📝 Abstract
Modern large language models (LLMs) are inherently auto-regressive, requiring input to be serialized into flat sequences regardless of their structural dependencies. This serialization hinders the model's ability to leverage structural inductive biases, especially in tasks such as retrieval-augmented generation (RAG) and reasoning on data with native graph structures, where inter-segment dependencies are crucial. We introduce Graph-KV with the potential to overcome this limitation. Graph-KV leverages the KV-cache of text segments as condensed representations and governs their interaction through structural inductive biases. In this framework, 'target' segments selectively attend only to the KV-caches of their designated 'source' segments, rather than all preceding segments in a serialized sequence. This approach induces a graph-structured block mask, sparsifying attention and enabling a message-passing-like step within the LLM. Furthermore, strategically allocated positional encodings for source and target segments reduce positional bias and context window consumption. We evaluate Graph-KV across three scenarios: (1) seven RAG benchmarks spanning direct inference, multi-hop reasoning, and long-document understanding; (2) Arxiv-QA, a novel academic paper QA task with full-text scientific papers structured as citation ego-graphs; and (3) paper topic classification within a citation network. By effectively reducing positional bias and harnessing structural inductive biases, Graph-KV substantially outperforms baselines, including standard costly sequential encoding, across various settings. Code and the Graph-KV data are publicly available.
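The abstract notes that positional encodings are "strategically allocated" so that source and target segments consume less of the context window and suffer less positional bias. The paper's exact scheme is not spelled out here; the sketch below shows one plausible reading, in which parallel source segments reuse the same position range and targets continue after the longest source, so occupied positions grow with the maximum (not the sum) of source lengths. The function name is hypothetical.

```python
def shared_source_positions(seg_lens, is_source):
    """Assign position ids so that parallel 'source' segments share one
    position range and 'target' segments continue after the longest source.

    seg_lens: token length of each segment.
    is_source: one flag per segment; sources are encoded in parallel.
    Returns one list of position ids per segment.
    """
    max_src = max((ln for ln, s in zip(seg_lens, is_source) if s), default=0)
    positions = []
    cursor = max_src  # targets start right after the shared source span
    for ln, s in zip(seg_lens, is_source):
        if s:
            positions.append(list(range(ln)))  # every source starts at 0
        else:
            positions.append(list(range(cursor, cursor + ln)))
            cursor += ln
    return positions
```

Because all sources occupy positions 0..max_src-1, no source is "earlier" than another, which is one way such a scheme could reduce the lost-in-the-middle style positional bias among retrieved segments.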
Problem

Research questions and friction points this paper is trying to address.

Overcomes LLMs' serialization limitation for structural data
Enhances retrieval-augmented generation via graph-structured attention
Reduces positional bias in graph-based reasoning tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

KV-cache as condensed segment representations
Graph-structured block mask sparsifies attention
Strategic positional encoding reduces bias
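The innovation bullets above combine into a single attention step: a boolean block mask gating scaled dot-product attention, so each target token aggregates information only from its graph neighbors, in a message-passing-like fashion. The sketch below is a generic single-head, numpy-only illustration of masked attention under that assumption, not the paper's code.

```python
import numpy as np

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention gated by a boolean mask
    (mask[i, j] == True means query i may attend to key j)."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores = np.where(mask, scores, -1e9)  # disallowed pairs get ~zero weight
    # Numerically stable softmax over the key axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

Passing the graph-structured block mask here sparsifies attention: rows whose mask permits only the designated source segments mix exactly those segments' values, which is the message-passing-style aggregation step.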
Haoyu Wang — Georgia Institute of Technology
Peihao Wang — The University of Texas at Austin
Mufei Li — Georgia Institute of Technology
Shikun Liu — Research Scientist, Meta AI
Siqi Miao — Georgia Institute of Technology
Zhangyang Wang — The University of Texas at Austin
Pan Li — Georgia Institute of Technology