Scaling Reproducibility: An AI-Assisted Workflow for Large-Scale Reanalysis

πŸ“… 2026-02-17
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work proposes an AI agent–driven workflow to address the high costs of reproducing large-scale empirical studies, which often stem from discrepancies in computational environments, code, and documentation. The approach decouples scientific reasoning from computational execution: researchers supply standardized diagnostic templates, and the system automatically retrieves and orchestrates reproduction materials within a version-controlled environment. A structured knowledge layer captures failure patterns, enabling adaptive reproduction across heterogeneous studies while ensuring transparency and stability of the analytical pipeline. Evaluated on 92 instrumental variable studies, the method achieves an 87% end-to-end reproduction success rate; when data and code are available, it attains 100% success at both the paper and model levels.

πŸ“ Abstract
Reproducibility is central to research credibility, yet large-scale reanalysis of empirical data remains costly because replication packages vary widely in structure, software environment, and documentation. We develop and evaluate an agentic AI workflow that addresses this execution bottleneck while preserving scientific rigor. The system separates scientific reasoning from computational execution: researchers design fixed diagnostic templates, and the workflow automates the acquisition, harmonization, and execution of replication materials using pre-specified, version-controlled code. A structured knowledge layer records resolved failure patterns, enabling adaptation across heterogeneous studies while keeping each pipeline version transparent and stable. We evaluate this workflow on 92 instrumental variable (IV) studies, including 67 with manually verified reproducible 2SLS estimates and 25 newly published IV studies under identical criteria. For each paper, we analyze up to three two-stage least squares (2SLS) specifications, totaling 215 specifications. Across the 92 papers, the system achieves 87% end-to-end success overall. Conditional on accessible data and code, reproducibility is 100% at both the paper and specification levels. The framework substantially lowers the cost of executing established empirical protocols and can be adapted in empirical settings where analytic templates and norms of transparency are well established.
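The abstract describes a retry loop in which execution failures are matched against a structured knowledge layer of previously resolved failure patterns. A minimal sketch of that idea is below; all names (`KnowledgeLayer`, `reproduce`, `run_2sls`) are hypothetical and not from the paper's replication materials.

```python
# Hypothetical sketch of the paper's knowledge-layer idea (not the authors' code):
# failed runs are matched against recorded failure patterns, a known fix is
# applied, and execution is retried. All names here are illustrative.
import re

class KnowledgeLayer:
    """Maps recorded failure patterns to fixes that previously resolved them."""
    def __init__(self):
        self._rules = []  # list of (compiled pattern, fix function)

    def register(self, pattern, fix_fn):
        self._rules.append((re.compile(pattern), fix_fn))

    def lookup(self, error_message):
        for pat, fix in self._rules:
            if pat.search(error_message):
                return fix
        return None

def reproduce(run_fn, env, knowledge, max_attempts=3):
    """Run a replication package; on a known failure, apply its fix and retry."""
    for _ in range(max_attempts):
        try:
            return run_fn(env)
        except RuntimeError as err:
            fix = knowledge.lookup(str(err))
            if fix is None:
                raise  # unresolved pattern: escalate to a human reviewer
            env = fix(env)  # apply the recorded fix, then retry
    raise RuntimeError("exceeded retry budget")

# Example: a missing-package failure resolved by a recorded fix
kl = KnowledgeLayer()
kl.register(r"package '(\w+)' not found",
            lambda env: {**env, "installed": env["installed"] | {"fixest"}})

def run_2sls(env):
    if "fixest" not in env["installed"]:
        raise RuntimeError("package 'fixest' not found")
    return "2SLS estimates reproduced"

print(reproduce(run_2sls, {"installed": set()}, kl))
```

The design choice this illustrates is that fixes accumulate across papers: once a failure pattern is resolved for one replication package, the same rule applies automatically to every later package that fails the same way.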
Problem

Research questions and friction points this paper is trying to address.

reproducibility
large-scale reanalysis
replication packages
execution bottleneck
empirical data
Innovation

Methods, ideas, or system contributions that make the work stand out.

AI-assisted workflow
reproducibility
agentic AI
structured knowledge layer
version-controlled execution
Yiqing Xu
Department of Political Science, Stanford University
political methodology, applied statistics, comparative politics, positive political economy
Leo Yang Yang
Department of Accountancy, Economics and Finance, School of Business, Hong Kong Baptist University, Kowloon, Hong Kong SAR