ACON: Optimizing Context Compression for Long-horizon LLM Agents

📅 2025-10-01
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address context inflation, escalating inference costs, and declining efficiency in LLM-based agents performing long-horizon tasks, caused by unbounded accumulation of action-observation histories, this paper proposes ACON, an end-to-end context compression framework tailored to multi-step agent tasks. The framework compresses both environment observations and interaction histories into concise yet informative condensations, and refines its compression guideline through failure-driven optimization: when a task succeeds with full context but fails with compressed context, a capable LLM analyzes the failure and updates the guideline in natural-language space. It further employs knowledge distillation to transfer the optimized compressor to lightweight models. Experiments across multiple benchmarks demonstrate peak-token reductions of 26%–54% while largely preserving task performance; distilled compressors retain over 95% of accuracy, and compression improves smaller LM agents by up to 46% over baselines.

📝 Abstract
Large language models (LLMs) are increasingly deployed as agents in dynamic, real-world environments, where success requires both reasoning and effective tool use. A central challenge for agentic tasks is the growing context length, as agents must accumulate long histories of actions and observations. This expansion raises costs and reduces efficiency in long-horizon tasks, yet prior work on context compression has mostly focused on single-step tasks or narrow applications. We introduce Agent Context Optimization (ACON), a unified framework that optimally compresses both environment observations and interaction histories into concise yet informative condensations. ACON leverages compression guideline optimization in natural language space: given paired trajectories where full context succeeds but compressed context fails, capable LLMs analyze the causes of failure, and the compression guideline is updated accordingly. Furthermore, we propose distilling the optimized LLM compressor into smaller models to reduce the overhead of the additional module. Experiments on AppWorld, OfficeBench, and Multi-objective QA show that ACON reduces memory usage by 26-54% (peak tokens) while largely preserving task performance, preserves over 95% of accuracy when distilled into smaller compressors, and enhances smaller LMs as long-horizon agents with up to 46% performance improvement.
Problem

Research questions and friction points this paper is trying to address.

Optimizing context compression for long-horizon LLM agents in dynamic environments
Managing growing context length as agents accumulate long histories of actions and observations
Reducing memory usage and inference cost while preserving task performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Compresses observations and histories into concise condensations
Optimizes compression guidelines through failure analysis
Distills optimized compressors into smaller models
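The guideline-optimization loop described above can be sketched in a few lines: compare agent outcomes with full versus compressed context, and when only the compressed run fails, fold a failure analysis back into the guideline. This is a minimal, self-contained illustration with toy stand-ins; the function names, the keyword-matching compressor, and the string-based "failure analysis" are assumptions for demonstration, not the paper's actual implementation (which uses capable LLMs for both compression and analysis).

```python
# Toy sketch of ACON-style guideline optimization (hypothetical names/logic).

def compress(observation: str, guideline: str) -> str:
    """Stub compressor: keep only lines mentioning a guideline keyword."""
    keywords = [w for w in guideline.lower().split() if len(w) > 3]
    kept = [ln for ln in observation.splitlines()
            if any(k in ln.lower() for k in keywords)]
    return "\n".join(kept)

def run_agent(context: str, goal: str) -> bool:
    """Stub agent: succeeds iff the goal's key fact survives in the context."""
    return goal.lower() in context.lower()

def analyze_failure(observation: str, compressed: str) -> str:
    """Stand-in for LLM failure analysis: name what the compressor dropped."""
    missing = [ln for ln in observation.splitlines() if ln not in compressed]
    return "also keep: " + " ".join(missing)

def optimize_guideline(guideline: str, trajectories, steps: int = 3) -> str:
    """Refine the guideline on pairs where full context succeeds but
    compressed context fails, as in the paper's contrastive setup."""
    for _ in range(steps):
        for obs, goal in trajectories:
            compressed = compress(obs, guideline)
            if run_agent(obs, goal) and not run_agent(compressed, goal):
                guideline += " " + analyze_failure(obs, compressed)
    return guideline
```

In practice the "analysis" step is an LLM prompt over the paired trajectories, and the resulting guideline conditions the compressor (or a distilled smaller model) at inference time.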