🤖 AI Summary
This work addresses critical limitations of monolithic agents in dynamic environments—namely, planning failure, API field errors, hallucination, and tool-output omissions. We propose a novel dual-model architecture that explicitly decouples planning from tool-memory management: a large language model (LLM) handles high-level reasoning and dynamic in-context learning, while a small language model (SLM) specializes in retaining structured tool schemas and performing output validation. Our approach is the first to empirically identify and exploit the intrinsic trade-off between in-context learning capability and static memory stability. Integrated with dynamic prompt engineering, structured tool-call constraints, and an error-diagnosis mechanism, the framework significantly enhances robustness across multi-tool tasks: API field error rates decrease by 42%, and task success rates in dynamic environments improve by 31%.
📝 Abstract
In this paper, we propose a novel factored agent architecture designed to overcome the limitations of traditional single-agent systems in agentic AI. Our approach decomposes the agent into two specialized components: (1) a large language model (LLM) that serves as a high-level planner and in-context learner, which may use dynamically available information in user prompts, and (2) a smaller language model that acts as a memorizer of tool formats and outputs. This decoupling addresses prevalent issues in monolithic designs, including malformed, missing, and hallucinated API fields, as well as suboptimal planning in dynamic environments. Empirical evaluations demonstrate that our factored architecture significantly improves planning accuracy and error resilience, while elucidating the inherent trade-off between in-context learning and static memorization. These findings suggest that a factored approach is a promising pathway for developing more robust and adaptable agentic AI systems.
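To make the division of labor concrete, the loop described above can be sketched in a few lines of Python. This is an illustrative toy, not the paper's implementation: the LLM planner and SLM memorizer are stubbed with plain functions, and all names (`planner_propose`, `memorizer_validate`, the `get_weather` schema) are hypothetical. The point is only the structure: the planner proposes a tool call, and the schema-holding memorizer catches malformed, missing, and hallucinated fields before execution.

```python
# Toy sketch of the factored agent loop: planner proposes, memorizer validates.
# Both models are stubbed; all names here are hypothetical illustrations.

TOOL_SCHEMAS = {
    "get_weather": {"required": {"city"}, "optional": {"units"}},
}

def planner_propose(task: str) -> dict:
    """Stand-in for the LLM planner. For illustration it emits a flawed
    call: one hallucinated field and one missing required field."""
    return {"tool": "get_weather", "args": {"country": "FR"}}

def memorizer_validate(call: dict) -> list[str]:
    """Stand-in for the SLM tool-memory: checks the proposed call
    against the stored schema and reports field-level errors."""
    schema = TOOL_SCHEMAS.get(call["tool"])
    if schema is None:
        return [f"unknown tool: {call['tool']}"]
    fields = set(call["args"])
    errors = [f"missing field: {f}" for f in schema["required"] - fields]
    allowed = schema["required"] | schema["optional"]
    errors += [f"hallucinated field: {f}" for f in fields - allowed]
    return errors

def run_agent(task: str) -> list[str]:
    call = planner_propose(task)
    # In the full architecture these errors would be fed back to the
    # planner for diagnosis and re-planning rather than just returned.
    return memorizer_validate(call)

print(run_agent("What's the weather in Paris?"))
# → ['missing field: city', 'hallucinated field: country']
```

In this toy, the memorizer flags both error classes the paper targets (the missing `city` field and the hallucinated `country` field), which would then drive the error-diagnosis and re-planning step.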