Enhancing Automated Paper Reproduction via Prompt-Free Collaborative Agents

📅 2025-12-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing automated paper reproduction frameworks lack built-in mechanisms for automatic validation and optimization during code generation, or rely excessively on manually engineered prompts, limiting their adaptability and scalability. To address this, we propose a prompt-free, dual-agent collaborative framework: a verification agent autonomously identifies output defects based solely on the original system prompt, while a refinement agent iteratively revises the generated outputs accordingly—requiring no human intervention. Integrated into the Paper2Code system, our framework achieves ~15% and ~13% performance gains on PaperBench Code-Dev and Paper2CodeBench, respectively, substantially outperforming the baseline and Self-Refine. This work marks the first end-to-end, prompt-agnostic pipeline for paper-to-executable-code reproduction with integrated self-validation.

📝 Abstract
Automated paper reproduction has emerged as a promising approach to accelerate scientific research, employing multi-step workflow frameworks to systematically convert academic papers into executable code. However, existing frameworks often lack mechanisms to verify and refine the outputs at each generation step, or rely heavily on manually designed prompts for self-refinement, which limits their adaptability and scalability. To address these limitations, we propose a prompt-free collaborative agent framework that automatically enhances the quality of paper-to-code generation. Our approach employs two collaborative agents: a verification agent that examines whether the outputs at each step satisfy the requirements specified in the corresponding system prompt, and a refinement agent that revises the outputs based on the identified issues. Unlike previous methods that require human experts to craft specific refinement prompts for each step, our framework achieves automatic verification and improvement by leveraging only the original system prompts. We integrate our collaborative agents into the Paper2Code framework and conduct comprehensive experiments on PaperBench Code-Dev and Paper2CodeBench datasets. Experimental results demonstrate that our approach significantly improves the accuracy and completeness of reproduced code, achieving performance gains of approximately 15% and 13%, respectively, compared to the baseline without our agents. Furthermore, comparative experiments against Self-Refine validate the robustness and consistency of our prompt-free approach across different datasets.
Problem

Research questions and friction points this paper is trying to address.

Automates verification and refinement in paper-to-code generation.
Eliminates reliance on manually crafted prompts for each step.
Improves accuracy and completeness of reproduced scientific code.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Collaborative agents verify outputs automatically
Refinement agent revises code without manual prompts
Leverages original system prompts for quality enhancement
Zijie Lin
University of Science and Technology of China, Hefei, China
Qilin Cai
University of Science and Technology of China, Hefei, China
Liang Shen
Meituan, Beijing, China
Mingjun Xiao
University of Science and Technology of China
Mobile Computing, Crowdsensing, Mobile Social Networks, Vehicular Networks