Task-agnostic Prompt Compression with Context-aware Sentence Embedding and Reward-guided Task Descriptor

📅 2025-02-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing prompt compression methods rely on task-specific questions or handcrafted templates, which limits their generalizability. This paper proposes Task-agnostic Prompt Compression (TPC), a framework that compresses prompts across tasks and domains without requiring task priors such as input questions or predefined templates. The approach rests on three components: (1) context-aware sentence embeddings used to score the importance of each sentence; (2) a task descriptor, trained on a curated dataset of context-query pairs and fine-tuned with reinforcement learning, that generates a context-relevant task description against which sentence relevance is computed; and (3) three model sizes (Base, Large, and Huge) supporting deployment at different scales. Experiments on LongBench and ZeroSCROLLS show that the largest TPC model outperforms state-of-the-art methods, while the smallest performs comparably to existing solutions at a considerably smaller parameter count, offering a favorable trade-off between efficiency and generalization.

📝 Abstract
The rise of Large Language Models (LLMs) has led to significant interest in prompt compression, a technique aimed at reducing the length of input prompts while preserving critical information. However, prominent approaches to prompt compression often require explicit questions or handcrafted templates for compression, limiting their generalizability. We propose Task-agnostic Prompt Compression (TPC), a novel framework that generalizes compression across tasks and domains without requiring input questions or templates. TPC generates a context-relevant task description using a task descriptor trained on a curated dataset of context and query pairs, and fine-tuned via reinforcement learning with a reward function designed to capture the most relevant information. The task descriptor is then utilized to compute the relevance of each sentence in the prompt to generate the compressed prompt. We introduce three model sizes (Base, Large, and Huge), where the largest model outperforms existing state-of-the-art methods on the LongBench and ZeroSCROLLS benchmarks, and our smallest model performs comparably to existing solutions while being considerably smaller.
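The final step the abstract describes, ranking sentences by relevance to the generated task description and keeping only the most relevant ones, can be sketched as follows. This is a minimal illustration under assumed inputs, not the paper's implementation: it takes precomputed sentence embeddings and a descriptor embedding (the paper uses context-aware embeddings and an RL-trained descriptor model), scores sentences by cosine similarity, and retains the top fraction in their original order.

```python
import numpy as np

def compress_prompt(sentences, sent_embeds, descriptor_embed, keep_ratio=0.5):
    """Sketch of relevance-based prompt compression.

    sentences: list of sentence strings from the prompt.
    sent_embeds: (n, d) array of sentence embeddings (assumed precomputed).
    descriptor_embed: (d,) embedding of the generated task description.
    keep_ratio: fraction of sentences to retain.
    """
    # Cosine similarity between each sentence and the task descriptor.
    sims = sent_embeds @ descriptor_embed / (
        np.linalg.norm(sent_embeds, axis=1) * np.linalg.norm(descriptor_embed) + 1e-9
    )
    # Keep the top-k most relevant sentences, preserving original order
    # so the compressed prompt remains coherent.
    k = max(1, int(len(sentences) * keep_ratio))
    keep = sorted(np.argsort(sims)[-k:])
    return " ".join(sentences[i] for i in keep)
```

With a 0.5 keep ratio, half the sentences survive; the actual method's compression rate and scoring model would differ, but the selection principle is the same.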
Problem

Research questions and friction points this paper is trying to address.

Generalize prompt compression across tasks
Eliminate need for input questions
Enhance compression with context-aware embeddings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Task-agnostic Prompt Compression
Context-aware Sentence Embedding
Reinforcement Learning Reward Function
Authors
Barys Liskavets (Alterra AI, Palo Alto, United States)
Shuvendu Roy (Queen's University | RBC Borealis; Former: Student Researcher @Google, Intern @Vector Institute; Computer Vision, Unsupervised Learning)
Maxim Ushakov (Alterra AI, Palo Alto, United States)
Mark Klibanov (Workday Inc.)
A. Etemad (Queen's University, Canada)
Shane Luke (Workday Inc.)