🤖 AI Summary
To address the problem of redundant and off-topic retrieved contexts in Retrieval-Augmented Generation (RAG), which degrade large language model (LLM) question-answering performance, this paper introduces the first unsupervised integration of pragmatic principles, specifically Grice's conversational maxims, into the RAG pipeline. The method uses semantic coverage analysis to identify the sentences strictly relevant to the query and proposes a context-aware, non-truncating highlighting mechanism that marks informative content without altering the original passage structure. The approach requires no fine-tuning, human annotation, or model modification, and is compatible with mainstream dense passage retrieval frameworks. Evaluated on ARC-Challenge, PubHealth, and PopQA across five LLMs, it consistently improves relative accuracy, by up to 19.7% on PubHealth and 10% on ARC-Challenge. The core contribution is the principled incorporation of pragmatics into RAG, establishing an interpretable, low-overhead paradigm for context optimization.
📝 Abstract
We propose a simple, unsupervised method that injects pragmatic principles into retrieval-augmented generation (RAG) frameworks such as Dense Passage Retrieval (Karpukhin et al., 2020) to enhance the utility of retrieved contexts. Our approach first identifies the sentences, in the pool of documents retrieved by RAG, that are most relevant to the question at hand and that cover all the topics addressed in the input question and no more; it then highlights these sentences within their context before they are provided to the LLM, without truncating or altering the context in any other way. We show that this simple idea brings consistent improvements in experiments on three question answering tasks (ARC-Challenge, PubHealth and PopQA) using five different LLMs. It notably enhances relative accuracy by up to 19.7% on PubHealth and 10% on ARC-Challenge compared to a conventional RAG system.
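The core idea, selecting question-relevant sentences and marking them in place rather than truncating the passage, can be sketched in a few lines. This is a hypothetical illustration, not the paper's implementation: word-overlap scoring stands in for the semantic coverage analysis (which would use sentence embeddings in practice), and the `**…**` markers, the `score` and `highlight_passage` helpers, and the `threshold` value are all assumptions made for the sketch.

```python
import re


def tokens(text):
    """Lowercase word tokens; a crude stand-in for semantic features."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(sentence, question):
    """Toy relevance proxy: fraction of question words found in the sentence.
    The paper's method would use semantic similarity instead."""
    q = tokens(question)
    return len(q & tokens(sentence)) / max(len(q), 1)


def highlight_passage(passage, question, threshold=0.3):
    """Return the FULL passage, with relevant sentences wrapped in ** markers.
    Nothing is removed: irrelevant sentences stay verbatim as context."""
    sentences = [s.strip() for s in passage.split(".") if s.strip()]
    out = []
    for sent in sentences:
        if score(sent, question) >= threshold:
            out.append(f"**{sent}.**")  # highlighted, not truncated
        else:
            out.append(f"{sent}.")      # kept as-is: structure preserved
    return " ".join(out)


passage = ("The Eiffel Tower is in Paris. It was completed in 1889. "
           "Paris is also known for its museums.")
question = "When was the Eiffel Tower completed?"
print(highlight_passage(passage, question))
```

The highlighted passage, rather than a trimmed one, is what gets prepended to the LLM prompt, so the model sees both the emphasized evidence and its surrounding context.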