AI Summary
To address the efficiency bottleneck of long-context in-context learning (ICL) caused by the Transformer's quadratic complexity, as well as issues of example redundancy and insufficient representativeness, this paper proposes a submodular function-based, block-aware context selection framework. It partitions inputs into semantically coherent context blocks and designs a submodular objective function that jointly optimizes for diversity and local coherence, enabling globally optimal block-level selection. Integrated with semantic compression and efficient attention mechanisms, the framework supports precomputation and fast inference. Extensive experiments across multiple tasks, diverse datasets, and various model scales demonstrate significant improvements in ICL performance, alongside strong scalability and stability. This work is the first to systematically introduce submodular optimization into context block selection, establishing a novel paradigm for efficient long-context utilization in large language models.
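The summary describes selecting context blocks by greedily maximizing a submodular objective over diversity. The paper's exact objective is not given here; as a minimal sketch, the snippet below assumes a facility-location function (a standard monotone submodular diversity objective) over precomputed block embeddings, maximized with the classic greedy algorithm, which carries a (1 - 1/e) approximation guarantee. The names `cosine`, `facility_location`, and `greedy_select` are hypothetical, not the paper's API.

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def facility_location(selected, blocks):
    # f(S) = sum over all blocks of their best similarity to any
    # selected block; monotone submodular, rewards coverage/diversity.
    if not selected:
        return 0.0
    return sum(max(cosine(b, blocks[j]) for j in selected) for b in blocks)

def greedy_select(blocks, k):
    # Classic greedy maximization: at each step add the block with the
    # largest marginal gain under f.
    selected = []
    for _ in range(min(k, len(blocks))):
        base = facility_location(selected, blocks)
        best_j, best_gain = None, -1.0
        for j in range(len(blocks)):
            if j in selected:
                continue
            gain = facility_location(selected + [j], blocks) - base
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
    return selected
```

For example, with three block embeddings where two are near-duplicates, the greedy step picks one representative of the duplicated direction and then the dissimilar block, rather than both near-duplicates, which is the redundancy-avoidance behavior the summary attributes to the submodular objective.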
Abstract
In-context learning (ICL) enables efficient few-shot learning in large language models (LLMs) without training, but suffers from the quadratic input complexity of Transformers, which limits the maximum number of exemplars. While various efficient ICL approaches partition the context into blocks for processing (e.g., ensembling, compression, cross-attention), they often ignore the information redundancy or under-representation caused by different partition strategies, leading to suboptimal performance. To tackle this problem, we propose Sub-CP, a block-aware context selection framework that leverages submodular objectives to control block diversity. Sub-CP supports a flexible spectrum of selection strategies, allowing each block to range from globally diverse to locally coherent. This allows fine-grained control over semantic structure while enabling precomputation. Extensive experiments across diverse tasks on multiple datasets show that Sub-CP consistently improves performance across model scales.