Can We Predict Before Executing Machine Learning Agents?

📅 2026-01-09

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 1

🤖 AI Summary

This work addresses an efficiency bottleneck in scientific discovery by machine learning agents: every hypothesis must be evaluated through costly physical execution. To overcome this, the authors propose a prediction-first decision paradigm that internalizes execution priors, so that large language models (LLMs) can judge candidate solutions from a Verified Data Analysis Report instead of running time-consuming real-world trials. The study formalizes, for the first time, the Data-centric Solution Preference task, constructs a corpus of 18,438 pairwise comparisons, and introduces a Predict-then-Verify loop alongside a prediction architecture inspired by World Models. This framework integrates Verified Data Analysis Report prompting with preference learning. Experiments report a prediction accuracy of 61.5% with robust confidence calibration, and the resulting FOREAGENT agent converges 6x faster than baselines while surpassing execution-based methods by 6% in overall performance.
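To make the two reported quantities concrete, the following is a minimal sketch of how pairwise prediction accuracy and confidence calibration could be scored. It is not the authors' evaluation code. The `PairwisePrediction` record, both function names, and the choice of expected calibration error (ECE) as the calibration measure are assumptions made for illustration; the summary does not say which calibration metric the paper uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PairwisePrediction:
    """Hypothetical record for one comparison between solutions A and B."""
    predicted_winner: str  # "A" or "B", as chosen by the predictor
    confidence: float      # predictor's stated confidence in its choice, in [0, 1]
    actual_winner: str     # "A" or "B", as established by real execution


def pairwise_accuracy(preds: List[PairwisePrediction]) -> float:
    """Fraction of comparisons where the predicted winner matches execution."""
    if not preds:
        raise ValueError("need at least one prediction")
    correct = sum(p.predicted_winner == p.actual_winner for p in preds)
    return correct / len(preds)


def expected_calibration_error(preds: List[PairwisePrediction], n_bins: int = 5) -> float:
    """ECE (assumed metric): weighted gap between confidence and accuracy per bin."""
    bins: List[List[PairwisePrediction]] = [[] for _ in range(n_bins)]
    for p in preds:
        # Equal-width bins over [0, 1]; confidence 1.0 falls into the last bin.
        idx = min(int(p.confidence * n_bins), n_bins - 1)
        bins[idx].append(p)

    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_conf = sum(p.confidence for p in bucket) / len(bucket)
        gap = abs(pairwise_accuracy(bucket) - mean_conf)
        ece += (len(bucket) / len(preds)) * gap
    return ece
```

A lower ECE means the stated confidence tracks the observed accuracy more closely, which is the property an agent needs before it can decide when a prediction is trustworthy enough to skip execution.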

📝 Abstract

Autonomous machine learning agents have revolutionized scientific discovery, yet they remain constrained by a Generate-Execute-Feedback paradigm. Previous approaches suffer from a severe Execution Bottleneck, as hypothesis evaluation relies strictly on expensive physical execution. To bypass these physical constraints, we internalize execution priors to substitute costly runtime checks with instantaneous predictive reasoning, drawing inspiration from World Models. In this work, we formalize the task of Data-centric Solution Preference and construct a comprehensive corpus of 18,438 pairwise comparisons. We demonstrate that LLMs exhibit significant predictive capabilities when primed with a Verified Data Analysis Report, achieving 61.5% accuracy and robust confidence calibration. Finally, we instantiate this framework in FOREAGENT, an agent that employs a Predict-then-Verify loop, achieving a 6x acceleration in convergence while surpassing execution-based baselines by +6%. Our code and dataset will be publicly available soon at https://github.com/zjunlp/predict-before-execute.
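The Predict-then-Verify loop named in the abstract can be pictured with a short sketch. This is an illustration under stated assumptions, not FOREAGENT's implementation: the function `predict_then_verify`, the `prefer` and `execute` callbacks, and the `verify_budget` parameter are all hypothetical names, and the round-robin ranking is just one simple way to turn pairwise preferences into an ordering.

```python
from typing import Callable, List, Tuple


def predict_then_verify(
    candidates: List[str],
    prefer: Callable[[str, str], bool],
    execute: Callable[[str], float],
    verify_budget: int = 1,
) -> Tuple[str, float, int]:
    """Rank candidates with a cheap predictor, then execute only the top few.

    prefer(a, b)  -> True if the predictor expects a to outperform b (cheap).
    execute(c)    -> real score from actually running candidate c (expensive).
    Returns (best candidate, its verified score, number of executions used).
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    if verify_budget < 1:
        raise ValueError("verify_budget must be at least 1")

    # Predict: round-robin pairwise comparisons, counting predicted wins.
    n = len(candidates)
    wins = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if prefer(candidates[i], candidates[j]):
                wins[i] += 1
            else:
                wins[j] += 1
    order = sorted(range(n), key=lambda i: -wins[i])

    # Verify: spend real execution only on the predicted front-runners.
    best, best_score, executed = candidates[order[0]], float("-inf"), 0
    for i in order[:verify_budget]:
        score = execute(candidates[i])
        executed += 1
        if score > best_score:
            best, best_score = candidates[i], score
    return best, best_score, executed
```

With three candidates and `verify_budget=1`, the sketch runs three cheap comparisons and a single real execution, whereas a pure Generate-Execute-Feedback loop would execute all three. Raising `verify_budget` trades speed for protection against predictor mistakes.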

Problem

Research questions and friction points this paper is trying to address.

Execution Bottleneck

Predictive Reasoning

Machine Learning Agents

Hypothesis Evaluation

Data-centric Solution Preference

Innovation

Methods, ideas, or system contributions that make the work stand out.

Predict-before-Execute

World Models

Data-centric Solution Preference