🤖 AI Summary
This study addresses the labor-intensive, expertise-dependent process of producing high-quality pipeline figures for scientific papers. The authors propose SciFig, an end-to-end AI agent system that constructs publication-ready figures directly from paper text. The approach combines hierarchical layout generation (parsing component relationships, grouping components into functional modules, and generating inter-module connections) with an iterative chain-of-thought (CoT) visual feedback mechanism and a rubric-based automated evaluation framework. The system achieves overall quality scores of 70.1% on dataset-level evaluation and 66.2% on paper-specific evaluation, with consistently high scores for visual clarity, structural organization, and scientific accuracy.
📝 Abstract
Creating high-quality figures and visualizations for scientific papers is a time-consuming task that requires both deep domain knowledge and professional design skills. Although over 2.5 million scientific papers are published annually, the figure generation process remains largely manual. We introduce $\textbf{SciFig}$, an end-to-end AI agent system that generates publication-ready pipeline figures directly from research paper text. SciFig uses a hierarchical layout generation strategy that parses research descriptions to identify component relationships, groups related elements into functional modules, and generates inter-module connections to establish visual organization. Furthermore, an iterative chain-of-thought (CoT) feedback mechanism progressively improves layouts through multiple rounds of visual analysis and reasoning. We also introduce a rubric-based evaluation framework that analyzes 2,219 real scientific figures to extract evaluation rubrics and automatically generates comprehensive evaluation criteria. SciFig achieves 70.1$\%$ overall quality on dataset-level evaluation and 66.2$\%$ on paper-specific evaluation, with consistently high scores across metrics such as visual clarity, structural organization, and scientific accuracy. The SciFig figure generation pipeline and our evaluation benchmark will be open-sourced.
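To make the hierarchical layout strategy above more concrete, the following is a minimal sketch of the grouping and connection steps only. It is not the SciFig implementation: the function name `build_hierarchy`, the input format, and the example component and module names are all hypothetical, and the module label of each component is assumed to be given, whereas in SciFig it would be derived from the paper text.

```python
from collections import defaultdict


def build_hierarchy(components, edges):
    """Group components into modules and derive inter-module connections.

    components: dict mapping component name -> module label (assumed given).
    edges: list of (source, target) component pairs.
    Both formats are illustrative assumptions, not SciFig's actual data model.
    """
    # Step 1: group related components into functional modules.
    modules = defaultdict(list)
    for name, module in components.items():
        modules[module].append(name)

    # Step 2: keep only edges that cross module boundaries, lifted to
    # module level and deduplicated while preserving first-seen order.
    inter_module = []
    seen = set()
    for src, dst in edges:
        pair = (components[src], components[dst])
        if pair[0] != pair[1] and pair not in seen:
            seen.add(pair)
            inter_module.append(pair)

    return dict(modules), inter_module


# Hypothetical example: a four-component pipeline in three modules.
components = {
    "tokenizer": "input",
    "encoder": "model",
    "decoder": "model",
    "loss": "training",
}
edges = [("tokenizer", "encoder"), ("encoder", "decoder"), ("decoder", "loss")]
modules, connections = build_hierarchy(components, edges)
```

In this sketch the `encoder` to `decoder` edge stays inside the `model` module, so only two module-level connections remain. A layout engine could then place modules first and arrange components within each module afterwards.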