LightAutoDS-Tab: Multi-AutoML Agentic System for Tabular Data

📅 2025-07-17

🤖 AI Summary
Existing AutoML systems for tabular data tasks suffer from limited flexibility and robustness due to over-reliance on a single underlying tool. This work proposes a collaborative multi-AutoML agent framework that integrates LLM-driven code generation, dynamic multi-engine scheduling, and task-feedback-guided iterative optimization to achieve end-to-end automated modeling. By decoupling the system from fixed tool dependencies, the architecture enhances adaptability and fault tolerance in challenging scenarios, including heterogeneous data, noisy inputs, and small-sample regimes. Evaluated on multiple Kaggle benchmark tasks, the approach surpasses current open-source state-of-the-art methods, achieving an average performance gain of 3.2% while demonstrating superior generalization and stability. The implementation is publicly available.

📝 Abstract
AutoML has advanced in handling complex tasks through the integration of LLMs, yet its efficiency remains limited by dependence on specific underlying tools. In this paper, we introduce LightAutoDS-Tab, a multi-AutoML agentic system for tasks with tabular data, which combines LLM-based code generation with several AutoML tools. Our approach improves the flexibility and robustness of pipeline design, outperforming state-of-the-art open-source solutions on several data science tasks from Kaggle. The code of LightAutoDS-Tab is available in the open repository https://github.com/sb-ai-lab/LADS
Problem

Research questions and friction points this paper is trying to address.

Enhances AutoML flexibility for tabular data tasks
Integrates LLM-based code generation with AutoML tools
Improves pipeline design robustness over existing solutions
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based code generation for AutoML
Integration of multiple AutoML tools
Enhanced pipeline flexibility and robustness
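
The idea of not depending on a single AutoML tool can be illustrated with a minimal Python sketch. It is not the LightAutoDS-Tab implementation: the engine names, the `run_with_fallback` helper, and the scores are hypothetical stand-ins, and real engines would be replaced by calls to actual AutoML libraries or LLM-generated pipelines. The sketch only shows the general pattern of trying several engines, tolerating failures, and keeping the best validation score.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical engine interface (an assumption for this sketch):
# takes a task description and returns a validation score, higher is better.
Engine = Callable[[dict], float]


@dataclass
class RunResult:
    engine: str
    score: float


def run_with_fallback(
    task: dict, engines: dict[str, Engine]
) -> tuple[Optional[RunResult], dict[str, str]]:
    """Try every engine, record failures, and return the best successful run."""
    best: Optional[RunResult] = None
    failures: dict[str, str] = {}
    for name, fit in engines.items():
        try:
            score = fit(task)
        except Exception as exc:  # one failing tool must not abort the whole run
            failures[name] = repr(exc)
            continue
        if best is None or score > best.score:
            best = RunResult(name, score)
    return best, failures


# Stand-in engines with made-up behaviour, for illustration only.
def engine_a(task: dict) -> float:
    raise RuntimeError("unsupported column type")


def engine_b(task: dict) -> float:
    return 0.81


def engine_c(task: dict) -> float:
    return 0.84


best, failures = run_with_fallback(
    {"target": "label"},
    {"engine_a": engine_a, "engine_b": engine_b, "engine_c": engine_c},
)
print(best, failures)
```

In an agentic setting, the recorded failure messages could be passed back to the code-generating LLM as feedback for the next attempt; that loop is omitted here to keep the sketch short.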
Aleksey Lapin
ITMO University
Igor Hromov
Sber AI Lab
Stanislav Chumakov
ITMO University
Mile Mitrovic
Sber AI Lab
Machine Learning, Deep Learning, Optimization, Algorithms
Dmitry Simakov
Sber AI Lab
data science
Nikolay O. Nikitin
ITMO University
Andrey V. Savchenko
Sber AI Lab