Concept than Document: Context Compression via AMR-based Conceptual Entropy

📅 2025-11-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the degradation in reasoning accuracy and increased computational overhead caused by redundant documents in long-context retrieval-augmented generation (RAG) for large language models (LLMs), this paper proposes an unsupervised context compression framework based on Abstract Meaning Representation (AMR) graphs and node-level concept entropy. Our method is the first to integrate AMR’s structured semantic graph representation with fine-grained, entropy-driven quantification of semantic node importance, enabling semantics-preserving, token- and concept-level information filtering—thereby overcoming the coarse-grained limitations of conventional document-level compression. Experiments on PopQA and EntityQuestions demonstrate that our approach achieves an average context length reduction of 62% while improving question-answering accuracy by 3.1–5.7 percentage points over state-of-the-art baselines, validating both its effectiveness and generalizability across diverse knowledge-intensive QA tasks.

📝 Abstract
Large Language Models (LLMs) face information overload when handling long contexts, particularly in Retrieval-Augmented Generation (RAG), where extensive supporting documents often introduce redundant content. This issue not only weakens reasoning accuracy but also increases computational overhead. We propose an unsupervised context compression framework that exploits Abstract Meaning Representation (AMR) graphs to preserve semantically essential information while filtering out irrelevant text. By quantifying node-level entropy within AMR graphs, our method estimates the conceptual importance of each node, enabling the retention of core semantics. Specifically, we construct AMR graphs from raw contexts, compute the conceptual entropy of each node, and screen significant informative nodes to form a context that is more condensed and semantically focused than the raw documents. Experiments on the PopQA and EntityQuestions datasets show that our method outperforms vanilla and other baselines, achieving higher accuracy while substantially reducing context length. To the best of our knowledge, this is the first work introducing AMR-based conceptual entropy for context compression, demonstrating the potential of stable linguistic features in context engineering.
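The pipeline described above (build AMR graphs, score each concept node by entropy, keep only the most informative nodes) can be sketched in miniature. This is a hypothetical illustration, not the paper's implementation: real AMR parsing requires a trained parser, and the paper's conceptual-entropy definition over graph structure is more involved. Here concept nodes are plain strings, and "entropy" is approximated by Shannon surprisal under corpus concept frequencies, so rare, specific concepts score higher than generic ones. The names `conceptual_entropy` and `compress_context` are illustrative.

```python
import math
from collections import Counter

def conceptual_entropy(concept, corpus_concepts):
    """Surprisal of a concept under its corpus frequency distribution.

    Rare concepts (e.g. a named entity) carry more information than
    frequent, generic ones (e.g. 'person'), so they are retained first.
    """
    counts = Counter(corpus_concepts)
    p = counts[concept] / sum(counts.values())
    return -math.log2(p)

def compress_context(amr_nodes, corpus_concepts, keep_ratio=0.5):
    """Keep the highest-entropy fraction of AMR concept nodes."""
    scored = sorted(
        amr_nodes,
        key=lambda c: conceptual_entropy(c, corpus_concepts),
        reverse=True,
    )
    k = max(1, int(len(scored) * keep_ratio))
    return scored[:k]

# Toy concept nodes as might be extracted from a retrieved document.
corpus = ["person", "person", "person", "city", "bear-02",
          "person", "city", "Einstein", "physics", "city"]
nodes = ["person", "city", "Einstein", "physics", "bear-02"]
kept = compress_context(nodes, corpus, keep_ratio=0.6)
print(kept)  # specific concepts survive; generic 'person'/'city' are pruned
```

In this toy run the named entity and domain concepts ("Einstein", "physics", "bear-02") outscore the generic fillers, matching the intuition that compression should discard semantically cheap nodes while keeping the answer-bearing ones.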
Problem

Research questions and friction points this paper is trying to address.

LLMs face information overload from redundant content in long contexts
Redundant retrieved documents weaken reasoning accuracy and increase computational overhead; existing document-level compression is too coarse-grained
Need to preserve essential semantics while filtering irrelevant text
Innovation

Methods, ideas, or system contributions that make the work stand out.

AMR graphs preserve semantic information for compression
Conceptual entropy quantifies node importance in AMR
Condensed context improves accuracy and reduces length
Kaize Shi
University of Southern Queensland
Xueyao Sun
University of Technology Sydney, The Hong Kong Polytechnic University
Xiaohui Tao
Full Professor, University of Southern Queensland, Australia
Artificial Intelligence, data mining, machine learning, natural language processing, knowledge
Lin Li
Wuhan University of Technology
Qika Lin
National University of Singapore | NTU | XJTU | BIT
Knowledge Reasoning, Neurosymbolic AI, Multi-modal, Robustness & Security, AI for Healthcare
Guandong Xu
The Education University of Hong Kong