AutoSOTA: An End-to-End Automated Research System for State-of-the-Art AI Model Discovery

📅 2026-04-07
🤖 AI Summary
This work addresses the inefficiency of current AI research, which relies heavily on labor-intensive manual reproduction, debugging, and tuning to surpass state-of-the-art (SOTA) performance. The authors propose AutoSOTA, an end-to-end automated research system in which eight specialized agents collaborate in a closed loop covering three stages: resource preparation and goal setting, experiment evaluation, and reflection and ideation. The agents ground papers to code and dependencies, set up and repair execution environments, track long-horizon experiments, generate and schedule optimization ideas, and supervise validity to avoid spurious gains. Evaluated on recent papers from eight top-tier AI conferences, filtered for code availability and execution cost, the system discovered 105 new SOTA models that surpass the originally reported methods, averaging approximately five hours per paper. Case studies spanning LLMs, NLP, computer vision, time series, and optimization show that it can go beyond routine hyperparameter tuning to architectural innovations, algorithmic redesigns, and workflow-level improvements.
📝 Abstract
Artificial intelligence research increasingly depends on prolonged cycles of reproduction, debugging, and iterative refinement to achieve state-of-the-art (SOTA) performance, creating a growing need for systems that can accelerate the full pipeline of empirical model optimization. In this work, we introduce AutoSOTA, an end-to-end automated research system that advances the latest SOTA models published in top-tier AI papers to reproducible and empirically improved new SOTA models. We formulate this problem through three tightly coupled stages: resource preparation and goal setting; experiment evaluation; and reflection and ideation. To tackle this problem, AutoSOTA adopts a multi-agent architecture with eight specialized agents that collaboratively ground papers to code and dependencies, initialize and repair execution environments, track long-horizon experiments, generate and schedule optimization ideas, and supervise validity to avoid spurious gains. We evaluate AutoSOTA on recent research papers collected from eight top-tier AI conferences under filters for code availability and execution cost. Across these papers, AutoSOTA achieves strong end-to-end performance in both automated replication and subsequent optimization. Specifically, it successfully discovers 105 new SOTA models that surpass the original reported methods, averaging approximately five hours per paper. Case studies spanning LLM, NLP, computer vision, time series, and optimization further show that the system can move beyond routine hyperparameter tuning to identify architectural innovations, algorithmic redesigns, and workflow-level improvements. These results suggest that end-to-end research automation can serve not only as a performance optimizer, but also as a new form of research infrastructure that reduces repetitive experimental burden and helps redirect human attention toward higher-level scientific creativity.
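The three stages named in the abstract can be read as a loop: reproduce a baseline, evaluate a candidate idea, then reflect and propose the next one, with a validity check guarding against spurious gains. The sketch below is only an illustration of that reading, not the authors' implementation. Every name in it (`RunState`, `optimize`, `reproduce`, `propose`, `evaluate`, `is_valid`, `budget`) is hypothetical, the callables stand in for what the paper assigns to its agents, and it assumes a single metric where higher is better.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class RunState:
    """Hypothetical bookkeeping for one paper; not the authors' data model."""
    baseline: float  # score obtained when reproducing the published method
    best: float      # best validated score so far
    history: List[Tuple[str, float, bool]] = field(default_factory=list)


def optimize(
    reproduce: Callable[[], float],
    propose: Callable[[RunState], str],
    evaluate: Callable[[str], float],
    is_valid: Callable[[str], bool],
    budget: int,
) -> RunState:
    # Stage 1 (resource preparation and goal setting): reproduce the baseline.
    baseline = reproduce()
    state = RunState(baseline=baseline, best=baseline)
    for _ in range(budget):
        # Stage 3 (reflection and ideation): pick the next idea from the history.
        idea = propose(state)
        # Stage 2 (experiment evaluation): run the idea and read its score.
        score = evaluate(idea)
        # Validity supervision: a higher score counts only if the idea is legitimate.
        accepted = score > state.best and is_valid(idea)
        state.history.append((idea, score, accepted))
        if accepted:
            state.best = score
    return state


# Toy usage with made-up ideas and scores (assumed data, higher is better).
ideas = iter(["tune_lr", "leak_test_labels", "new_loss"])
scores = {"tune_lr": 0.81, "leak_test_labels": 0.99, "new_loss": 0.84}
state = optimize(
    reproduce=lambda: 0.80,
    propose=lambda s: next(ideas),
    evaluate=scores.__getitem__,
    is_valid=lambda idea: idea != "leak_test_labels",
    budget=3,
)
print(state.best)
```

In this toy run the idea with the highest raw score is rejected because it fails the validity check, which is the role the abstract gives to supervision against spurious gains.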
Problem

Research questions and friction points this paper is trying to address.

State-of-the-Art
AI model discovery
automated research
empirical optimization
reproducibility
Innovation

Methods, ideas, or system contributions that make the work stand out.

AutoSOTA
automated research system
multi-agent architecture
SOTA model discovery
end-to-end optimization
Authors
Yu Li
Department of Electronic Engineering, BNRist, Tsinghua University
Chenyang Shao
PhD student, EE, Tsinghua University
Large Language Model, LLM Agent, RL
Xinyang Liu
Department of Electronic Engineering, BNRist, Tsinghua University
Ruotong Zhao
Department of Electronic Engineering, BNRist, Tsinghua University
Peijie Liu
Department of Electronic Engineering, BNRist, Tsinghua University
Hongyuan Su
Department of Electronic Engineering, BNRist, Tsinghua University; Zhongguancun Academy
Zhibin Chen
Assistant Professor of Engineering, New York University Shanghai
Transportation Network Modeling & Optimization, Transportation Economics
Qinglong Yang
Department of Electronic Engineering, BNRist, Tsinghua University
Anjie Xu
Zhongguancun Academy; Peking University
Yi Fang
Zhongguancun Academy; University of Science and Technology of China
Qingbin Zeng
Department of Electronic Engineering, BNRist, Tsinghua University
Tianxing Li
Assistant Professor, Michigan State University
Mobile Sensing, Wireless Network, Visible Light Communication
Jingbo Xu
Alibaba Group
graph computing, graph data, data management, database, LLM
Fengli Xu
Tsinghua University
LLM Agent, Data Science, Social Computing, Science of Science, Urban Science
Yong Li
Professor, Electronic Engineering, Tsinghua University
Urban Science, Data Mining, AI for Science
Tie-Yan Liu
President, Zhongguancun Academy | IEEE Fellow | ACM Fellow | AAIA Fellow
Machine learning, AI for Science, AI for Industry, Information retrieval, NLP