Decocted Experience Improves Test-Time Inference in LLM Agents

📅 2026-04-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of effectively leveraging test-time computation to improve the performance of large language models on complex reasoning and agent-based tasks without updating model parameters. The authors propose a "decocted experience" mechanism that extracts, structures, and retrieves critical information from historical interactions to construct high-quality contextual prompts that guide reasoning. By dynamically incorporating such experience-driven context, the method significantly improves model performance across diverse domains, including mathematical reasoning, web navigation, and software engineering, while reducing compute wasted on suboptimal exploration. The results demonstrate that strategically curating contextual information from past experiences is pivotal for augmenting test-time reasoning capabilities in large language models.
📝 Abstract
There is growing interest in improving LLMs without updating model parameters. One well-established direction is test-time scaling, where increased inference-time computation (e.g., longer reasoning, sampling, or search) is used to improve performance. However, for complex reasoning and agentic tasks, naively scaling test-time compute can substantially increase cost and still lead to wasted budget on suboptimal exploration. In this paper, we explore *context* as a complementary scaling axis for improving LLM performance, and systematically study how to construct better inputs that guide reasoning through *experience*. We show that effective context construction critically depends on *decocted experience*. We present a detailed analysis of experience-augmented agents, studying how to derive context from experience, how performance scales with accumulated experience, what characterizes good context, and which data structures best support context construction. We identify *decocted experience* as a key mechanism for effective context construction: extracting essence from experience, organizing it coherently, and retrieving salient information to build effective context. We validate our findings across reasoning and agentic tasks, including math reasoning, web browsing, and software engineering.
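The abstract's three-step mechanism (extract essence from experience, organize it, retrieve salient entries to build context) can be illustrated with a minimal sketch. All names, the keyword-overlap retrieval, and the prompt template below are illustrative assumptions, not the paper's actual implementation:

```python
# Hypothetical sketch of an experience-augmented context builder:
# distilled takeaways from past episodes are stored, ranked against a
# new query, and prepended to it as a prompt prefix.
from dataclasses import dataclass


@dataclass
class Experience:
    task: str       # short description of the past task
    takeaway: str   # distilled lesson ("essence") from that episode


class ExperienceStore:
    def __init__(self) -> None:
        self.entries: list[Experience] = []

    def add(self, task: str, takeaway: str) -> None:
        # "Extract": store only the distilled takeaway, not the full trace.
        self.entries.append(Experience(task, takeaway))

    def retrieve(self, query: str, k: int = 2) -> list[Experience]:
        # "Retrieve": rank stored experiences by naive keyword overlap
        # with the query (a stand-in for any real retriever).
        q = set(query.lower().split())
        scored = sorted(
            self.entries,
            key=lambda e: len(q & set(e.task.lower().split())),
            reverse=True,
        )
        return scored[:k]

    def build_context(self, query: str) -> str:
        # "Organize": assemble retrieved takeaways into a prompt prefix.
        lessons = "\n".join(f"- {e.takeaway}" for e in self.retrieve(query))
        return f"Relevant past lessons:\n{lessons}\n\nTask: {query}"


store = ExperienceStore()
store.add("solve quadratic equation", "check the discriminant before factoring")
store.add("navigate shopping website", "locate the search bar before browsing menus")
prompt = store.build_context("solve cubic equation symbolically")
print(prompt)
```

A production version would replace the keyword overlap with embedding-based retrieval and a learned or prompted summarizer for the takeaway extraction; the structure of the three steps stays the same.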
Problem

Research questions and friction points this paper is trying to address.

test-time inference
LLM agents
experience augmentation
context construction
reasoning tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

decocted experience
test-time inference
context construction
LLM agents
experience-augmented reasoning