AI Summary
This work addresses the challenge of reproducing academic papers through automated code generation, which is often hindered by missing tacit knowledge, such as implementation nuances and debugging insights. The study formalizes three types of tacit knowledge: relational, somatic, and collective. To recover this knowledge, the authors propose a graph-based agent framework with one mechanism per type: relation-aware aggregation, execution-feedback refinement, and graph-level knowledge induction. They also introduce an extended version of ReproduceBench, an evaluation benchmark spanning three domains, ten tasks, and forty papers. Experimental results show that the generated code achieves an average performance gap of 10.04% relative to official implementations, a 24.68% improvement over the strongest baseline.
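The summary reports an average performance gap but does not define it. One plausible reading, shown here purely as an assumption, is the relative difference between the official and the reproduced score, averaged over papers. The function names and the sample scores below are hypothetical and are not taken from the paper.

```python
from statistics import mean
from typing import Iterable, Tuple


def relative_gap(official: float, reproduced: float) -> float:
    """Relative gap in percent (assumed definition, not the paper's)."""
    if official == 0:
        raise ValueError("official score must be non-zero")
    return abs(official - reproduced) / abs(official) * 100.0


def average_gap(pairs: Iterable[Tuple[float, float]]) -> float:
    """Mean relative gap over (official, reproduced) score pairs."""
    return mean(relative_gap(o, r) for o, r in pairs)


# Hypothetical scores for illustration only; these are not results from the paper.
scores = [(80.0, 72.0), (0.50, 0.45), (200.0, 190.0)]
print(f"{average_gap(scores):.2f}%")
```

Under this reading, a lower value means the generated code tracks the official implementation more closely. Whether the reported 24.68% improvement is relative or in percentage points is not stated in the text.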
Abstract
Automated paper reproduction, the generation of executable code from academic papers, is bottlenecked not by information retrieval but by the tacit knowledge that papers inevitably leave implicit. We formalize this challenge as the progressive recovery of three types of tacit knowledge (relational, somatic, and collective) and propose a graph-based agent framework with a dedicated mechanism for each: node-level relation-aware aggregation recovers relational knowledge by analyzing implementation-unit-level reuse and adaptation relationships between the target paper and its citation neighbors; execution-feedback refinement recovers somatic knowledge through iterative debugging driven by runtime signals; and graph-level knowledge induction distills collective knowledge from clusters of papers sharing similar implementations. On an extended ReproduceBench spanning 3 domains, 10 tasks, and 40 recent papers, the framework achieves an average performance gap of 10.04% against official implementations, improving over the strongest baseline by 24.68%. The code will be publicly released upon acceptance; the repository link will be provided in the final version.
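To make the idea of execution-feedback refinement concrete, the following is a minimal sketch of a generic run-and-repair loop, not the authors' implementation. It assumes a repair step that takes the current code and the runtime error and returns revised code. Here that step is a toy function named `toy_repair`; in an agent framework it would presumably be a model call. All names are hypothetical.

```python
import traceback
from typing import Callable, Optional, Tuple


def run_candidate(source: str) -> Optional[str]:
    """Run candidate code in a fresh namespace; return the traceback text on failure."""
    try:
        exec(compile(source, "<candidate>", "exec"), {"__name__": "__candidate__"})
        return None
    except Exception:
        return traceback.format_exc()


def refine(
    source: str,
    repair: Callable[[str, str], str],
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Run the code and pass each runtime error to `repair` until it runs cleanly."""
    for _ in range(max_rounds):
        error = run_candidate(source)
        if error is None:
            return source, True
        source = repair(source, error)
    # Final check after the last repair attempt.
    return source, run_candidate(source) is None


def toy_repair(source: str, error: str) -> str:
    """Hypothetical stand-in for a model-driven repairer: fixes one known NameError."""
    if "NameError" in error and "math" in error:
        return "import math\n" + source
    return source


buggy = "result = math.sqrt(16)\nassert result == 4.0\n"
fixed, ok = refine(buggy, toy_repair)
print(ok)
print(fixed)
```

The sketch uses in-process `exec` only to stay short. Running untrusted generated code would normally call for an isolated subprocess or container, with stdout, stderr, and exit codes serving as the runtime signals.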