🤖 AI Summary
Current automated data science systems face performance bottlenecks and a heavy reliance on domain expertise, and existing approaches struggle to balance efficiency and accuracy. To address this, we propose the first research–development dual-agent collaborative framework: a Researcher agent generates improvement strategies based on performance feedback, while a Developer agent iteratively refines code guided by error signals; the two agents coordinate dynamically via a dual closed-loop interaction, enabling multi-path parallel exploration, dynamic trajectory fusion, and result aggregation. Built upon large language models (LLMs), the framework integrates feedback-driven code generation, correction, and search-enhanced optimization. Evaluated on the MLE-Bench benchmark, it achieves state-of-the-art performance and ranks first on the Machine Learning Engineering Agent Leaderboard. The open-sourced implementation demonstrates strong cross-task generalization and practical engineering applicability.
📝 Abstract
Recent advances in AI and ML have transformed data science, yet increasing complexity and expertise requirements continue to hinder progress. While crowdsourcing platforms alleviate some challenges, high-level data science tasks remain labor-intensive and iterative. To overcome these limitations, we introduce R&D-Agent, a dual-agent framework for iterative exploration. The Researcher agent uses performance feedback to generate ideas, while the Developer agent refines code based on error feedback. By enabling multiple parallel exploration traces that merge and enhance one another, R&D-Agent narrows the gap between automated solutions and expert-level performance. Evaluated on MLE-Bench, R&D-Agent emerges as the top-performing machine learning engineering agent, demonstrating its potential to accelerate innovation and improve precision across diverse data science applications. We have open-sourced R&D-Agent on GitHub: https://github.com/microsoft/RD-Agent.
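The dual closed-loop interaction described above can be illustrated with a toy sketch. This is not the R&D-Agent implementation (which is LLM-driven; see the GitHub repository): all function names (`researcher_propose`, `developer_refine`, `run_trace`, `multi_trace_search`) and the numeric "candidate solution" are illustrative stand-ins, chosen only to show the shape of the loop — a Researcher step conditioned on performance feedback, a Developer step conditioned on an error signal, several parallel exploration traces, and aggregation of the best result.

```python
import random

def researcher_propose(score_history):
    """Toy Researcher: propose a change informed by past performance scores."""
    if len(score_history) >= 2 and score_history[-1] <= score_history[-2]:
        return random.uniform(-1.0, 1.0)  # progress stalled: explore more widely
    return random.uniform(0.0, 0.5)       # progress made: smaller refinement

def developer_refine(candidate, idea, target):
    """Toy Developer: apply the idea, then correct using an error signal."""
    candidate = candidate + idea
    error = target - candidate      # stands in for runtime/error feedback
    return candidate + 0.5 * error  # one correction step guided by the error

def run_trace(target, steps=20, seed=0):
    """One exploration trace of the research-development closed loop."""
    random.seed(seed)
    candidate, scores = 0.0, []
    for _ in range(steps):
        idea = researcher_propose(scores)                # research loop
        candidate = developer_refine(candidate, idea, target)  # development loop
        scores.append(-abs(target - candidate))          # higher is better
    return candidate, scores[-1]

def multi_trace_search(target, n_traces=4):
    """Run several parallel traces and aggregate by keeping the best one."""
    results = [run_trace(target, seed=s) for s in range(n_traces)]
    return max(results, key=lambda r: r[1])

best, score = multi_trace_search(target=3.0)
```

In the real framework the "idea" is a natural-language improvement strategy, the "refinement" is LLM-generated code corrected against execution errors, and traces can additionally merge (trajectory fusion) rather than only compete.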