🤖 AI Summary
LLM-based agents exhibit low success rates on complex real-world tasks; existing work predominantly focuses on agent-centric improvements while overlooking the critical role of the system environment. Method: This paper introduces, for the first time, a taxonomy of six failure modes in agent–environment interaction and proposes three environment-side enhancement techniques—enhanced environment observability, general-purpose computational offloading, and speculative agent actions—that require no modification to the agent or underlying LLM. Contribution/Results: Empirical evaluation across 142 agent trajectories (3,656 interaction turns) demonstrates that our methods improve task success rates by 6.7–12.5% on five mainstream agent benchmarks. These results substantially surpass the performance ceiling of agent-only optimization, establishing a plug-and-play, system-level paradigm for deploying LLM agents in practical settings.
📝 Abstract
Large Language Model (LLM) agents augmented with domain tools promise to autonomously execute complex tasks requiring human-level intelligence, such as customer service and digital assistance. However, their practical deployment is often limited by low success rates in complex real-world environments. To address this, prior research has focused primarily on improving the agents themselves, for example by developing stronger agentic LLMs, while overlooking the role of the system environment in which the agent operates.
In this paper, we study a complementary direction: improving agent success rates by optimizing the system environment in which the agent operates. We collect 142 agent traces (3,656 turns of agent-environment interaction) across five state-of-the-art agentic benchmarks. By analyzing the failures in these traces, we propose a taxonomy of agent-environment interaction failures comprising six failure modes. Guided by these findings, we design Aegis, a set of targeted environment optimizations: 1) environment observability enhancement, 2) common computation offloading, and 3) speculative agentic actions. These techniques improve agent success rates by 6.7-12.5% on average, without any modifications to the agent or the underlying LLM.