MLZero: A Multi-Agent System for End-to-end Machine Learning Automation

📅 2025-05-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing AutoML systems struggle with end-to-end automation in multimodal scenarios and rely heavily on manual configuration. This paper introduces MLZero, a novel end-to-end AutoML framework for multimodal data powered by an LLM-driven multi-agent system. It integrates a cognitive perception module with a dual-memory mechanism (semantic and episodic memory) to mitigate LLM hallucination and stale API knowledge. The framework automates data understanding, model selection, code generation, and iterative refinement from minimal user input. On MLE-Bench Lite it secures six gold medals, and on a 25-task multimodal AutoML benchmark it attains a 0.92 success rate, surpassing the second-best system by 263.6%. It remains effective even with a compact 8B-parameter LLM, outperforming competitors' full-size systems.

📝 Abstract
Existing AutoML systems have advanced the automation of machine learning (ML); however, they still require substantial manual configuration and expert input, particularly when handling multimodal data. We introduce MLZero, a novel multi-agent framework powered by Large Language Models (LLMs) that enables end-to-end ML automation across diverse data modalities with minimal human intervention. A cognitive perception module is first employed, transforming raw multimodal inputs into perceptual context that effectively guides the subsequent workflow. To address key limitations of LLMs, such as hallucinated code generation and outdated API knowledge, we enhance the iterative code generation process with semantic and episodic memory. MLZero demonstrates superior performance on MLE-Bench Lite, outperforming all competitors in both success rate and solution quality, securing six gold medals. Additionally, when evaluated on our Multimodal AutoML Agent Benchmark, which includes 25 more challenging tasks spanning diverse data modalities, MLZero outperforms the competing methods by a large margin with a success rate of 0.92 (+263.6%) and an average rank of 2.28. Our approach maintains its robust effectiveness even with a compact 8B LLM, outperforming full-size systems from existing solutions.
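The abstract's dual-memory idea (semantic memory for curated API knowledge, episodic memory for traces of earlier attempts) can be illustrated with a minimal sketch. All class and method names below are hypothetical stand-ins, not taken from the MLZero codebase; the point is only how the two memories might feed the code-generation prompt.

```python
from dataclasses import dataclass, field

@dataclass
class DualMemory:
    # semantic memory: curated, up-to-date API notes (counters stale knowledge)
    semantic: dict = field(default_factory=dict)
    # episodic memory: (code, error) traces from earlier attempts (counters repeats)
    episodic: list = field(default_factory=list)

    def remember_api(self, name: str, note: str) -> None:
        self.semantic[name] = note

    def remember_attempt(self, code: str, error: str) -> None:
        self.episodic.append((code, error))

    def build_prompt_context(self, task: str) -> str:
        # Assemble both memories into context for the next generation round.
        api_notes = "\n".join(f"- {k}: {v}" for k, v in self.semantic.items())
        failures = "\n".join(f"- failed before: {e}" for _, e in self.episodic[-3:])
        return f"Task: {task}\nKnown APIs:\n{api_notes}\nPast failures:\n{failures}"

memory = DualMemory()
memory.remember_api("fit", "TabularPredictor.fit(train_data) trains a model")
memory.remember_attempt("pred.fit(x, epochs=5)", "TypeError: unexpected 'epochs'")
print(memory.build_prompt_context("train a tabular classifier"))
```

The last three episodic traces are included so the prompt stays bounded while still steering the LLM away from its most recent mistakes.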
Problem

Research questions and friction points this paper is trying to address.

Automating end-to-end ML with minimal human intervention
Handling multimodal data without manual configuration
Improving LLM code generation accuracy and API knowledge
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent LLM framework for end-to-end ML automation
Cognitive perception module for multimodal data processing
Enhanced code generation with semantic and episodic memory