Memory-Augmented LLM-based Multi-Agent System for Automated Feature Generation on Tabular Data

📅 2026-04-22
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing automated feature generation methods either rely on predefined operator libraries that overlook task semantics, or use large language models whose fixed generation patterns and lack of feedback limit the features they can produce. Both make it hard to balance feature diversity and effectiveness. This work proposes MALMAS, a system that adds memory-augmented mechanisms to a multi-agent LLM framework. It combines role-specialized agents, dynamic routing by a Router Agent, and three memory modules (procedural, feedback, and conceptual) to support semantics-driven, iterative feature generation and refinement. The authors report that, in experiments on multiple public datasets against state-of-the-art baselines, MALMAS improves feature quality, diversity, and downstream task performance.

📝 Abstract
Automated feature generation extracts informative features from raw tabular data without manual intervention and is crucial for accurate, generalizable machine learning. Traditional methods rely on predefined operator libraries and cannot leverage task semantics, limiting their ability to produce diverse, high-value features for complex tasks. Recent Large Language Model (LLM)-based approaches introduce richer semantic signals, but still suffer from a restricted feature space due to fixed generation patterns and from the absence of feedback from the learning objective. To address these challenges, we propose a Memory-Augmented LLM-based Multi-Agent System (MALMAS) for automated feature generation. MALMAS decomposes the generation process into agents with distinct responsibilities, and a Router Agent activates an appropriate subset of agents per iteration, broadening exploration of the feature space. We further integrate a memory module comprising procedural memory, feedback memory, and conceptual memory, enabling iterative refinement that adaptively guides subsequent feature generation and improves feature quality and diversity. Extensive experiments on multiple public datasets against state-of-the-art baselines demonstrate the effectiveness of our approach. The code is available at https://github.com/fxdong24/MALMAS
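
The loop the abstract describes (a router activating a subset of agents per iteration, with three memories guiding later rounds) can be illustrated with a toy sketch. This is not the MALMAS implementation: the agent functions, the `route` heuristic, the column names `a` and `b`, and the use of absolute Pearson correlation as the feedback signal are all assumptions made for illustration, and the agents are hard-coded stand-ins for what would be LLM calls.

```python
from dataclasses import dataclass, field
from math import sqrt
from typing import Callable, Dict, List, Tuple

Table = Dict[str, List[float]]
Proposal = Dict[str, Tuple[str, Callable[[float, float], float]]]


@dataclass
class Memory:
    # Accepted feature name -> expression that produced it.
    procedural: Dict[str, str] = field(default_factory=dict)
    # (agent, feature, score gain) for every evaluated proposal.
    feedback: List[Tuple[str, str, float]] = field(default_factory=list)
    # Short lessons distilled from accepted features.
    conceptual: List[str] = field(default_factory=list)


# Hypothetical role-specialized agents (stand-ins for LLM calls).
def ratio_agent(table: Table, memory: Memory) -> Proposal:
    return {"a_over_b": ("a / (b + 1e-9)", lambda a, b: a / (b + 1e-9))}


def interaction_agent(table: Table, memory: Memory) -> Proposal:
    return {"a_times_b": ("a * b", lambda a, b: a * b)}


AGENTS = {"ratio": ratio_agent, "interaction": interaction_agent}


def pearson(x: List[float], y: List[float]) -> float:
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denom = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return cov / denom if denom else 0.0


def route(memory: Memory, k: int = 1) -> List[str]:
    """Activate k agents: untried agents first, then highest mean past gain."""
    def mean_gain(name: str) -> float:
        gains = [g for agent, _, g in memory.feedback if agent == name]
        return sum(gains) / len(gains) if gains else float("inf")
    return sorted(AGENTS, key=mean_gain, reverse=True)[:k]


def run(table: Table, target: List[float], iterations: int = 3):
    table = {name: list(col) for name, col in table.items()}  # do not mutate input
    memory = Memory()
    # Assumed feedback signal: best absolute correlation with the target so far.
    best = max(abs(pearson(col, target)) for col in table.values())
    for _ in range(iterations):
        for agent_name in route(memory):
            proposals = AGENTS[agent_name](table, memory)
            for feat, (expr, fn) in proposals.items():
                if any(f == feat for _, f, _ in memory.feedback):
                    continue  # already evaluated in an earlier iteration
                values = [fn(a, b) for a, b in zip(table["a"], table["b"])]
                gain = abs(pearson(values, target)) - best
                memory.feedback.append((agent_name, feat, gain))
                if gain > 0:  # keep only features that improve the score
                    table[feat] = values
                    memory.procedural[feat] = expr
                    memory.conceptual.append(f"{agent_name} agent helped via {expr}")
                    best += gain
    return table, memory
```

Calling `run` with a table that has columns `a` and `b` plus a target list returns the augmented table and the filled memory. In this sketch, feedback memory is what the router reads, procedural memory records how each kept feature was built, and conceptual memory holds free-text notes that a real agent prompt could reuse.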
Problem

Research questions and friction points this paper is trying to address.

automated feature generation
tabular data
Large Language Model
feature space
task semantics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Memory-Augmented
Multi-Agent System
Automated Feature Generation
Large Language Model
Tabular Data