🤖 AI Summary
Existing AutoML systems for tabular data tasks suffer from limited flexibility and robustness due to over-reliance on a single underlying tool. This work proposes a collaborative multi-AutoML agent framework that integrates LLM-driven code generation, dynamic multi-engine scheduling, and task-feedback-guided iterative optimization to achieve end-to-end automated modeling. By decoupling the system from fixed tool dependencies, the architecture enhances adaptability and fault tolerance in challenging scenarios, including heterogeneous data, noisy inputs, and small-sample regimes. Evaluated on multiple Kaggle benchmark tasks, the approach surpasses current open-source state-of-the-art methods, achieving an average performance gain of 3.2% while demonstrating superior generalization and stability. The implementation is publicly available.
📝 Abstract
AutoML has advanced in handling complex tasks through the integration of LLMs, yet its efficiency remains limited by dependence on specific underlying tools. In this paper, we introduce LightAutoDS-Tab, a multi-AutoML agentic system for tabular data tasks that combines LLM-based code generation with several AutoML tools. Our approach improves the flexibility and robustness of pipeline design, outperforming state-of-the-art open-source solutions on several data science tasks from Kaggle. The code of LightAutoDS-Tab is available in an open repository: https://github.com/sb-ai-lab/LADS
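The idea of combining a code-proposing component with several interchangeable AutoML tools can be illustrated with a minimal sketch. This is not the LightAutoDS-Tab implementation: the names `Attempt`, `run_multi_engine`, `propose`, `engine_a`, and `engine_b` are hypothetical, the engines are toy functions rather than real AutoML libraries, and `propose` is a plain function standing in for an LLM call. The sketch only shows how a failing engine need not break the run and how earlier results can feed into later configurations.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical abstraction: an engine takes a config and returns a validation score.
Engine = Callable[[dict], float]


@dataclass
class Attempt:
    engine: str
    config: dict
    score: Optional[float]        # None when the engine raised an error
    error: Optional[str] = None


def run_multi_engine(
    engines: Dict[str, Engine],
    propose: Callable[[str, List[Attempt]], dict],
    rounds: int = 3,
) -> Tuple[Optional[Attempt], List[Attempt]]:
    """Try every engine in each round, feeding the history back to `propose`."""
    history: List[Attempt] = []
    for _ in range(rounds):
        for name, engine in engines.items():
            config = propose(name, history)   # stand-in for an LLM suggestion
            try:
                history.append(Attempt(name, config, engine(config)))
            except Exception as exc:          # one failing engine does not stop the run
                history.append(Attempt(name, config, None, str(exc)))
    succeeded = [a for a in history if a.score is not None]
    best = max(succeeded, key=lambda a: a.score) if succeeded else None
    return best, history


# Toy engines (assumptions for illustration only).
def engine_a(config: dict) -> float:
    return 0.70 + 0.05 * config["budget"]


def engine_b(config: dict) -> float:
    raise RuntimeError("unsupported column type")


def propose(name: str, history: List[Attempt]) -> dict:
    # Feedback rule: raise the budget by one per earlier success of this engine.
    prior = [a for a in history if a.engine == name and a.score is not None]
    return {"budget": len(prior) + 1}


best, history = run_multi_engine(
    {"engine_a": engine_a, "engine_b": engine_b}, propose, rounds=3
)
```

In this toy run `engine_b` fails in every round, yet the loop still returns the best `engine_a` attempt, which is the kind of tool independence the abstract describes. A real system would replace the toy engines with wrappers around actual AutoML libraries and `propose` with a model-generated pipeline or configuration.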