🤖 AI Summary
Large language models (LLMs) exhibit weak multi-step planning in interactive environments, struggle to adapt online to new information, and rely heavily on long interaction histories packed into their prompts. To address these limitations, this paper proposes an online learning framework that requires no fine-tuning. Methodologically, it introduces (1) a dynamic mechanism for extracting and injecting atomic facts, giving the agent a structured representation of its experience; and (2) an LLM-based latent world model combined with depth-limited recursive forward search and state-value estimation, enabling planning to be refined in real time. Crucially, the framework performs no parameter updates: adaptation is achieved entirely through in-context learning. Evaluated on complex interactive benchmarks, including TextFrozenLake and ALFWorld, the approach significantly improves task success rates and policy optimality, demonstrating both efficiency and strong generalization across diverse environments.
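As a concrete illustration of mechanism (1), the sketch below shows one plausible shape for fact extraction and prompt injection. It is a minimal, hypothetical rendering rather than the paper's implementation: `llm` stands in for any text-completion call, and the prompt wording, `extract_atomic_facts`, and `augment_prompt` are all illustrative names.

```python
# Hypothetical sketch of atomic-fact extraction and injection; all names are
# illustrative, and `llm` is a placeholder for any text-completion call.

def llm(prompt: str) -> str:
    """Placeholder for a call to a language model."""
    raise NotImplementedError

def extract_atomic_facts(trajectory: list[str], fact_store: set[str]) -> set[str]:
    """Ask the LLM to distill short, task-critical facts from a trajectory."""
    prompt = (
        "Extract short, standalone facts useful for future attempts.\n"
        "Trajectory:\n" + "\n".join(trajectory) + "\nFacts (one per line):"
    )
    new_facts = {line.strip() for line in llm(prompt).splitlines() if line.strip()}
    # Facts accumulate across episodes; the model weights never change.
    return fact_store | new_facts

def augment_prompt(base_prompt: str, fact_store: set[str]) -> str:
    """Inject the accumulated facts so every LLM component shares them."""
    facts_block = "\n".join(f"- {f}" for f in sorted(fact_store))
    return f"Known facts:\n{facts_block}\n\n{base_prompt}"
```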
📝 Abstract
Large Language Models (LLMs) are increasingly capable but often require significant guidance or extensive interaction history to perform effectively in complex, interactive environments. Existing methods may struggle to adapt to new information or to efficiently use past experience for multi-step reasoning without fine-tuning. We introduce a novel LLM agent framework that enhances planning capabilities through in-context learning, facilitated by atomic fact augmentation and a recursive lookahead search. Our agent learns to extract task-critical "atomic facts" from its interaction trajectories. These facts dynamically augment the prompts provided to LLM-based components responsible for action proposal, latent world model simulation, and state-value estimation. Planning is performed via a depth-limited lookahead search, in which the LLM simulates potential trajectories and evaluates their outcomes, guided by the accumulated facts and interaction history. This approach allows the agent to improve its understanding and decision-making online, leveraging its experience to refine its behavior without weight updates. We provide a theoretical motivation linking performance to the quality of the fact-based abstraction and the accuracy of the LLM's simulations. Empirically, our agent demonstrates improved performance and adaptability on challenging interactive tasks, achieving increasingly optimal behavior as it accumulates experience, as showcased on tasks such as TextFrozenLake and ALFWorld.
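To make the planning loop concrete, here is a minimal sketch of a depth-limited recursive lookahead under the assumptions the abstract describes: an LLM proposes candidate actions, simulates transitions as a latent world model, and estimates values at leaf states, all conditioned on the accumulated facts. The helpers `propose_actions`, `simulate_step`, and `estimate_value`, as well as the discount `gamma`, are hypothetical placeholders, not the authors' API.

```python
# Minimal sketch of depth-limited recursive lookahead; the three stubs below
# stand in for fact-augmented LLM calls and are assumptions, not the paper's API.

def propose_actions(state: str, facts: set[str]) -> list[str]:
    """LLM proposes candidate actions for `state`, conditioned on facts (stub)."""
    raise NotImplementedError

def simulate_step(state: str, action: str, facts: set[str]) -> tuple[str, float]:
    """LLM as latent world model: predict the next state and reward (stub)."""
    raise NotImplementedError

def estimate_value(state: str, facts: set[str]) -> float:
    """LLM estimates the value of a leaf state (stub)."""
    raise NotImplementedError

def lookahead(state: str, facts: set[str], depth: int, gamma: float = 0.95) -> float:
    """Score a state by recursively simulating trajectories to a fixed depth."""
    if depth == 0:
        return estimate_value(state, facts)
    best = float("-inf")
    for action in propose_actions(state, facts):
        next_state, reward = simulate_step(state, action, facts)
        best = max(best, reward + gamma * lookahead(next_state, facts, depth - 1))
    return best

def act(state: str, facts: set[str], depth: int = 2) -> str:
    """Choose the action whose simulated rollout backs up the highest value."""
    def score(action: str) -> float:
        next_state, reward = simulate_step(state, action, facts)
        return reward + lookahead(next_state, facts, depth - 1)
    return max(propose_actions(state, facts), key=score)
```

One consequence of this structure is cost: with b candidate actions per state, a depth-d search issues on the order of b^d world-model calls per decision, which is why the depth limit matters.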