Scientific Graphics Program Synthesis via Dual Self-Consistency Reinforcement Learning

📅 2026-04-07
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses scientific diagram program synthesis, where poor data quality and the lack of a comprehensive evaluation framework keep models from generating executable TikZ code that is both visually accurate and structurally coherent. The authors propose the SciTikZ framework, which introduces SciTikZ-230K, a high-quality dataset of roughly 230,000 samples covering 11 scientific disciplines and built with an execution-centric data engine, and SciTikZ-Bench, a multifaceted benchmark ranging from basic geometric constructs to intricate hierarchical schematics. They further develop a Dual Self-Consistency Reinforcement Learning approach that uses round-trip verification to penalize degenerate code and improve self-consistency. The resulting SciTikZer-8B model achieves state-of-the-art performance, outperforming Gemini-2.5-Pro and Qwen3-VL-235B-A22B-Instruct.
📝 Abstract
Graphics Program Synthesis is pivotal for interpreting and editing visual data, effectively facilitating the reverse-engineering of static visuals into editable TikZ code. While TikZ is the de facto standard for scientific schematics due to its programmatic flexibility, its requirement for rigorous spatial precision presents a significant challenge for Multimodal Large Language Models. Progress is currently stifled by two primary gaps: (1) Data Quality Gap: existing image-TikZ corpora often lack strict executability and reliable visual alignment; (2) Evaluation Gap: a lack of benchmarks for both structural and visual fidelity. To address these, we present a closed-loop framework featuring: SciTikZ-230K, a large-scale, high-quality dataset from our Execution-Centric Data Engine covering 11 diverse scientific disciplines; SciTikZ-Bench, a multifaceted benchmark spanning from basic geometric constructs to intricate hierarchical schematics to evaluate both visual fidelity and structural logic. To further broaden the scope of visual-code optimization methodology, we introduce a novel Dual Self-Consistency Reinforcement Learning optimization paradigm, which utilizes Round-Trip Verification to penalize degenerate code and boost overall self-consistency. Empowered by these, our trained model SciTikZer-8B achieves state-of-the-art performance, consistently outperforming proprietary giants like Gemini-2.5-Pro and massive models like Qwen3-VL-235B-A22B-Instruct.
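The abstract does not spell out how Round-Trip Verification enters the reward, so the following is only a minimal sketch under stated assumptions, not the authors' formulation. It assumes a reward that is zero for code that fails to compile and otherwise mixes (a) visual agreement between the target image and the rendered code with (b) agreement between the code and the code regenerated from its own rendering. The function name `dual_consistency_reward`, the callables `render`, `image_similarity`, and `regenerate`, and the weight `w_visual` are all hypothetical.

```python
from difflib import SequenceMatcher
from typing import Callable, Optional


def dual_consistency_reward(
    target_image: object,
    code: str,
    render: Callable[[str], Optional[object]],
    image_similarity: Callable[[object, object], float],
    regenerate: Callable[[object], str],
    w_visual: float = 0.5,
) -> float:
    """Hypothetical reward combining visual and round-trip consistency.

    render:            compiles TikZ code to an image, or returns None on failure
    image_similarity:  returns a score in [0, 1] for two images
    regenerate:        maps an image back to TikZ code (the model under training)
    """
    rendered = render(code)
    if rendered is None:
        # Assumption: non-executable code receives no reward.
        return 0.0

    # Visual consistency: does the rendered code look like the target?
    visual = image_similarity(target_image, rendered)

    # Round trip: regenerate code from the rendering and compare it with the
    # original code. A plain text ratio stands in for a structural metric here.
    round_trip_code = regenerate(rendered)
    structural = SequenceMatcher(None, code, round_trip_code).ratio()

    return w_visual * visual + (1.0 - w_visual) * structural
```

In practice `render` would wrap a LaTeX compiler call and `image_similarity` a perceptual metric; both are injected as callables so the sketch stays independent of any particular toolchain.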
Problem

Research questions and friction points this paper is trying to address.

Graphics Program Synthesis
Data Quality Gap
Evaluation Gap
TikZ
Scientific Schematics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graphics Program Synthesis
Dual Self-Consistency Reinforcement Learning
Execution-Centric Data Engine
SciTikZ-Bench
Round-Trip Verification
Juekai Lin
Zhejiang University, Shanghai Artificial Intelligence Laboratory, OpenDataLab
Yun Zhu
Shanghai Artificial Intelligence Laboratory, OpenDataLab
Honglin Lin
SJTU
Sijing Li
Zhejiang University
MLLM
Tianwei Lin
Zhejiang University
MLLMs
Zheng Liu
Shanghai Artificial Intelligence Laboratory, OpenDataLab, Peking University
Xiaoyang Wang
Lecturer in Artificial Intelligence, University of Exeter
Reinforcement Learning, Signal Processing, 6G Communications, Computer Vision
Wenqiao Zhang
Zhejiang University
Lijun Wu
Shanghai AI Laboratory
ML, LLM, AI4Science