🤖 AI Summary
This work addresses the challenge that existing machine learning engineering agents struggle to effectively enhance algorithmic performance through iterative optimization. To overcome this limitation, the authors propose MLE-Ideator, a dual-agent framework that decouples idea generation from algorithm implementation: the implementation agent can solicit strategic suggestions from a dedicated Ideator agent. The study introduces, for the first time, a trainable Ideator whose idea-generation capability is optimized via reinforcement learning, establishing a novel paradigm for AI systems aimed at scientific discovery. Evaluated on the MLE-Bench benchmark using the Qwen3-8B model, the approach significantly outperforms implementation-only baselines in a training-free setting. Moreover, after RL training on only 1K samples, the Ideator yields an 11.5% relative performance improvement over its untrained counterpart, surpassing Claude Sonnet 3.5.
📝 Abstract
Existing machine learning engineering (MLE) agents struggle to iteratively optimize their implemented algorithms for effectiveness. To address this, we introduce MLE-Ideator, a dual-agent framework that separates ideation from implementation. In our system, an implementation agent can request strategic help from a dedicated Ideator. We show this approach is effective in two ways. First, in a training-free setup, our framework significantly outperforms implementation-only agent baselines on MLE-Bench. Second, we demonstrate that the Ideator can be trained with reinforcement learning (RL) to generate more effective ideas. With only 1K training samples from 10 MLE tasks, our RL-trained Qwen3-8B Ideator achieves an 11.5% relative improvement compared to its untrained counterpart and surpasses Claude Sonnet 3.5. These results highlight a promising path toward training strategic AI systems for scientific discovery.
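To make the dual-agent interaction concrete, the loop below is a minimal sketch of the described separation of ideation and implementation: an implementation agent iterates on a solution and queries a separate Ideator for a strategic suggestion at each step. All class and method names (`Ideator`, `Implementer`, `propose`, `optimize`) and the toy scoring function are illustrative assumptions, not from the paper.

```python
class Ideator:
    """Stand-in for the Ideator agent: proposes a strategy from run history.

    In the paper this role is played by an LLM (optionally RL-trained);
    here a fixed rule keeps the sketch self-contained.
    """
    def propose(self, history):
        if not history or history[-1]["score"] < 0.9:
            return "try a stronger model or feature set"
        return "tune hyperparameters"


class Implementer:
    """Stand-in for the implementation agent: executes ideas and tracks scores."""
    def __init__(self, ideator):
        self.ideator = ideator
        self.history = []

    def evaluate(self, idea):
        # Placeholder for actually implementing the idea and running
        # the ML pipeline; returns a synthetic, monotonically rising score.
        return 0.5 + 0.1 * len(self.history)

    def optimize(self, steps=3):
        for _ in range(steps):
            idea = self.ideator.propose(self.history)   # request strategic help
            score = self.evaluate(idea)                 # implement and measure
            self.history.append({"idea": idea, "score": score})
        return max(h["score"] for h in self.history)


best = Implementer(Ideator()).optimize()
```

The key design point the sketch mirrors is the decoupling: the Ideator never touches the code, and the implementation agent never has to invent strategy, which is what makes the Ideator independently trainable with RL.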