AI Summary
Existing LLM-driven AutoML agents suffer from low code-generation diversity and suboptimal node selection due to reliance on scalar feedback. To address these issues, we propose Introspective Monte Carlo Tree Search (Introspective MCTS), a novel framework featuring: (i) a reflective node expansion mechanism that leverages parent- and sibling-node feedback, the first of its kind; (ii) an LLM-based value model for proactive evaluation of the solution space prior to execution; and (iii) a hybrid reward function integrating LLM-generated scores with ground-truth performance metrics to smooth search guidance. Evaluated across diverse machine learning tasks, our framework significantly improves exploration quality and generalization capability. On mainstream open-source Agentic AutoML benchmarks, it achieves a 6% absolute performance gain. This work establishes a new, interpretable, and iterative decision-optimization paradigm for LLM-based AutoML.
Abstract
Recent advancements in large language models (LLMs) have shown remarkable potential in automating machine learning tasks. However, existing LLM-based agents often struggle with low-diversity and suboptimal code generation. While recent work has introduced Monte Carlo Tree Search (MCTS) to address these issues, limitations persist in the quality and diversity of the thoughts generated, as well as in the scalar value feedback mechanisms used for node selection. In this study, we introduce Introspective Monte Carlo Tree Search (I-MCTS), a novel approach that iteratively expands tree nodes through an introspective process that analyzes the solutions and results of parent and sibling nodes. This enables continuous refinement of the nodes in the search tree, thereby improving the overall decision-making process. Furthermore, we integrate an LLM-based value model to evaluate each node's solution directly, before conducting comprehensive computational rollouts. A hybrid rewarding mechanism transitions the Q-value from LLM-estimated scores to actual performance scores, which allows higher-quality nodes to be traversed earlier. Applied to various ML tasks, our approach demonstrates a 6% absolute improvement in performance compared to strong open-source AutoML agents, showcasing its effectiveness in enhancing agentic AutoML systems.
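The hybrid rewarding idea can be illustrated with a minimal Python sketch. The abstract does not give the blending rule, so everything here is an assumption made for illustration: the `Node` fields, the linear schedule in `hybrid_q`, the `warmup_visits` parameter, and the use of a standard UCT bonus in `select_child` are hypothetical and may differ from what I-MCTS actually does.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One candidate solution in the search tree (hypothetical structure)."""
    llm_score: float                     # value-model estimate in [0, 1], known before execution
    true_score: Optional[float] = None   # measured metric in [0, 1], known after a rollout
    visits: int = 0
    children: List["Node"] = field(default_factory=list)


def hybrid_q(node: Node, warmup_visits: int = 3) -> float:
    """Blend the LLM estimate with the measured score.

    Assumption: a linear schedule that shifts weight toward the measured
    score as the node accumulates visits.
    """
    if node.true_score is None:
        # No rollout yet: fall back to the LLM estimate alone.
        return node.llm_score
    w = min(1.0, node.visits / warmup_visits)
    return (1.0 - w) * node.llm_score + w * node.true_score


def select_child(parent: Node, c: float = 1.4) -> Node:
    """Pick the child maximizing a standard UCT score built on hybrid_q."""
    def uct(child: Node) -> float:
        bonus = c * math.sqrt(math.log(parent.visits + 1) / (child.visits + 1))
        return hybrid_q(child) + bonus
    return max(parent.children, key=uct)
```

Under this sketch, an unexecuted node is ranked purely by its LLM estimate, so a promising node can be selected before any rollout has been paid for, and the measured score gradually takes over once results exist.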