🤖 AI Summary
To address the large prediction errors of chemistry tools and the limited effectiveness of LLM-based agents in tool invocation, this paper proposes ChemHAS (Chemical Hierarchical Agent Stacking), the first framework to enable LLM agents to actively perceive and inversely correct tool errors. Methodologically, ChemHAS integrates structured fine-tuning for tool error awareness, few-shot-driven agent topology optimization, and multi-layer agent collaborative orchestration. The paper empirically identifies and formally characterizes four interpretable agent-stacking behavioral patterns. On four foundational chemistry tasks, including molecular property prediction and reaction condition recommendation, ChemHAS achieves state-of-the-art performance. Tool invocation accuracy improves by 12.7–28.3%, and scientific reasoning reliability is significantly enhanced. The code and datasets are publicly available.
📝 Abstract
Large Language Model (LLM)-based agents have demonstrated the ability to improve performance in chemistry-related tasks by selecting appropriate tools. However, their effectiveness remains limited by the inherent prediction errors of chemistry tools. In this paper, we take a step further by exploring how LLM-based agents can, in turn, be leveraged to reduce the prediction errors of the tools. To this end, we propose ChemHAS (Chemical Hierarchical Agent Stacking), a simple yet effective method that enhances chemistry tools by optimizing agent-stacking structures from limited data. ChemHAS achieves state-of-the-art performance across four fundamental chemistry tasks, demonstrating that our method can effectively compensate for the prediction errors of the tools. Furthermore, we identify and characterize four distinct agent-stacking behaviors, potentially improving interpretability and revealing new possibilities for AI agent applications in scientific research. Our code and dataset are publicly available at https://anonymous.4open.science/r/ChemHAS-01E4/README.md.