SciFig: Towards Automating Scientific Figure Generation

📅 2026-01-07

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

This study addresses the labor-intensive, expertise-dependent process of creating high-quality pipeline figures for scientific papers. The authors propose SciFig, an end-to-end AI agent system that generates publication-ready pipeline figures directly from paper text. The approach uses hierarchical layout generation: it parses the research description to identify component relationships, groups related elements into functional modules, and generates inter-module connections. An iterative chain-of-thought (CoT) feedback mechanism then refines the layout over multiple rounds of visual analysis and reasoning, and a rubric-based automated evaluation framework scores the results. SciFig achieves overall quality scores of 70.1% on dataset-level evaluation and 66.2% on paper-specific evaluation, with consistently high scores for visual clarity, structural organization, and scientific accuracy.

📝 Abstract

Creating high-quality figures and visualizations for scientific papers is a time-consuming task that requires both deep domain knowledge and professional design skills. Despite over 2.5 million scientific papers published annually, the figure generation process remains largely manual. We introduce SciFig, an end-to-end AI agent system that generates publication-ready pipeline figures directly from research paper texts. SciFig uses a hierarchical layout generation strategy, which parses research descriptions to identify component relationships, groups related elements into functional modules, and generates inter-module connections to establish visual organization. Furthermore, an iterative chain-of-thought (CoT) feedback mechanism progressively improves layouts through multiple rounds of visual analysis and reasoning. We introduce a rubric-based evaluation framework that analyzes 2,219 real scientific figures to extract evaluation rubrics and automatically generates comprehensive evaluation criteria. SciFig demonstrates remarkable performance: achieving 70.1% overall quality on dataset-level evaluation and 66.2% on paper-specific evaluation, and consistently high scores across metrics such as visual clarity, structural organization, and scientific accuracy. SciFig figure generation pipeline and our evaluation benchmark will be open-sourced.
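
The hierarchical layout idea in the abstract (components, then functional modules, then inter-module connections) can be illustrated with a small sketch. This is not SciFig's implementation: the abstract does not say how parsing or grouping is performed, so the component-to-module mapping below is supplied by hand, and the function names `group_into_modules` and `inter_module_edges` and all example data are hypothetical.

```python
from collections import defaultdict


def group_into_modules(component_to_module):
    """Invert a component -> module mapping into module -> sorted components."""
    modules = defaultdict(list)
    for component, module in component_to_module.items():
        modules[module].append(component)
    return {name: sorted(members) for name, members in modules.items()}


def inter_module_edges(component_edges, component_to_module):
    """Lift component-level edges to module-level edges.

    Edges whose endpoints share a module are dropped, and duplicate
    module-level edges are kept only once, in first-seen order.
    """
    edges = []
    for src, dst in component_edges:
        m_src = component_to_module[src]
        m_dst = component_to_module[dst]
        if m_src != m_dst and (m_src, m_dst) not in edges:
            edges.append((m_src, m_dst))
    return edges


# Hypothetical input: in a real system this would come from parsing paper text.
component_to_module = {
    "tokenizer": "Preprocessing",
    "normalizer": "Preprocessing",
    "encoder": "Model",
    "decoder": "Model",
    "scorer": "Evaluation",
}
component_edges = [
    ("tokenizer", "normalizer"),
    ("normalizer", "encoder"),
    ("encoder", "decoder"),
    ("decoder", "scorer"),
]

modules = group_into_modules(component_to_module)
edges = inter_module_edges(component_edges, component_to_module)
```

Collapsing component-level links into module-level links is one plausible reading of "generates inter-module connections"; the actual system may derive these connections differently.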

Problem

Research questions and friction points this paper is trying to address.

scientific figure generation

visualization

automated design

publication-ready figures

manual process

Innovation

Methods, ideas, or system contributions that make the work stand out.

scientific figure generation

hierarchical layout generation

iterative chain-of-thought feedback