Scaling Long-Horizon LLM Agent via Context-Folding

📅 2025-10-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the fundamental challenge of context-length limitations hindering large language model (LLM) agents in long-horizon tasks, this paper proposes Context-Folding—a framework for dynamic, efficient context compression via learnable subtask decomposition and execution-path folding. Methodologically, it introduces FoldGRPO, the first end-to-end reinforcement learning algorithm that jointly optimizes task decomposition and folding policies using process-based rewards, moving beyond static, summary-driven compression. Evaluated on complex long-horizon benchmarks—including Deep Research and SWE—the approach matches or exceeds ReAct’s task performance while reducing active context length by up to 10×. This substantial compression significantly outperforms existing summarization-based context management techniques, demonstrating improved scalability and fidelity in extended reasoning trajectories.

📝 Abstract
Large language model (LLM) agents are fundamentally constrained by context length on long-horizon tasks. We introduce Context-Folding, a framework that empowers agents to actively manage their working context. An agent can procedurally branch into a sub-trajectory to handle a subtask and then fold it upon completion, collapsing the intermediate steps while retaining a concise summary of the outcome. To make this behavior learnable, we develop an end-to-end reinforcement learning framework FoldGRPO with specific process rewards to encourage effective task decomposition and context management. On complex long-horizon tasks (Deep Research and SWE), our folding agent matches or outperforms the ReAct baselines while using an active context 10× smaller and significantly outperforms models that rely on summarization-based context management.
Problem

Research questions and friction points this paper is trying to address.

Addresses context-length constraints in long-horizon LLM agents
Enables active context management through procedural branching and folding of sub-trajectories
Improves task decomposition and efficiency while keeping the active context small
Innovation

Methods, ideas, or system contributions that make the work stand out.

Framework actively manages working context
Procedurally branches and folds sub-trajectories
Reinforcement learning optimizes task decomposition
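The branch-and-fold mechanism above can be sketched as a small context manager: the agent opens a sub-trajectory for a subtask, accumulates intermediate steps, and on completion replaces those steps with a concise summary. This is a minimal illustrative sketch, not the paper's actual API; all names (`ContextFoldingAgent`, `branch`, `fold`, etc.) are hypothetical.

```python
# Illustrative sketch of context folding (hypothetical names, not the paper's API).

class ContextFoldingAgent:
    def __init__(self):
        self.active_context = []   # the messages the LLM actually sees
        self._branch_stack = []    # saved start positions of open sub-trajectories

    def step(self, message):
        """Append an ordinary reasoning or tool-call step to the active context."""
        self.active_context.append(message)

    def branch(self, subtask):
        """Open a sub-trajectory for a subtask; remember where it starts."""
        self._branch_stack.append(len(self.active_context))
        self.active_context.append(f"[branch] {subtask}")

    def fold(self, summary):
        """Collapse the open sub-trajectory, keeping only a concise summary."""
        start = self._branch_stack.pop()
        self.active_context = self.active_context[:start]
        self.active_context.append(f"[folded] {summary}")


agent = ContextFoldingAgent()
agent.step("plan overall task")
agent.branch("locate failing test")
agent.step("run test suite")
agent.step("inspect traceback")
agent.fold("failure is in the date parser: TypeError on None input")
# The active context now holds only the plan and the folded summary;
# the intermediate sub-trajectory steps have been collapsed.
```

In the paper, when and what to branch, fold, and summarize is not hand-coded as above but learned end-to-end with FoldGRPO's process rewards.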