🤖 AI Summary
Existing chart editing benchmarks suffer from limited data diversity and rely on complete source code, failing to reflect real-world scenarios where only a chart image and a natural-language instruction are available. To address this, we propose ChartEditVista, the first large-scale, source-code-free benchmark, featuring diverse chart types and fine-grained editing instructions. We introduce two rule-based evaluation metrics assessing layout consistency and textual accuracy, and design a rendering reward that jointly optimizes code executability and visual fidelity. Methodologically, our approach combines automated data generation with reinforcement learning for robust end-to-end image-to-code chart editing. Experiments demonstrate that our method significantly outperforms baselines of comparable and larger parameter counts; human evaluation further confirms superior editing accuracy and visual quality.
📝 Abstract
Chart editing reduces manual effort in visualization design. Typical benchmarks are limited in data diversity and assume access to the complete chart code, which is seldom available in real-world scenarios. To address this gap, we present ChartEditVista, a comprehensive benchmark consisting of 7,964 samples spanning 31 chart categories. It encompasses diverse editing instructions and covers nearly all editable chart elements. The inputs in ChartEditVista include only the original chart image and a natural-language editing instruction, without the original chart code. ChartEditVista is generated through a fully automated pipeline that produces, edits, and verifies charts, ensuring high-quality chart editing data. In addition, we introduce two novel fine-grained, rule-based evaluation metrics: a layout metric, which evaluates the position, size, and color of graphical components; and a text metric, which jointly assesses textual content and font styling. Building on ChartEditVista, we present ChartEditor, a model trained with a reinforcement learning framework that incorporates a novel rendering reward to simultaneously enforce code executability and visual fidelity. Through extensive experiments and human evaluations, we demonstrate that ChartEditVista provides a robust evaluation, and that ChartEditor consistently outperforms both similar-scale and larger-scale models on chart editing tasks.
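The idea of a rendering reward that combines executability with visual fidelity can be sketched as follows. The details here (rendering candidate code with matplotlib, scoring fidelity by mean absolute pixel error against the target image) are illustrative assumptions, not the paper's actual reward definition:

```python
# Hypothetical sketch of a rendering reward for RL-based chart editing.
# Assumption: candidate outputs are matplotlib code; the reward is 0 when the
# code fails to execute, otherwise a pixel-level similarity to the target chart.
from typing import Optional

import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display required
import matplotlib.pyplot as plt


def render(code: str) -> Optional[np.ndarray]:
    """Execute candidate chart code; return the rendered RGB array, or None on failure."""
    try:
        plt.close("all")
        exec(code, {"plt": plt, "np": np})  # proper sandboxing omitted for brevity
        fig = plt.gcf()
        fig.canvas.draw()
        rgb = np.asarray(fig.canvas.buffer_rgba())[..., :3]
        return rgb.astype(np.float32) / 255.0
    except Exception:
        return None


def rendering_reward(code: str, target: np.ndarray) -> float:
    """0.0 if the code does not execute; otherwise a fidelity score in (0, 1]."""
    img = render(code)
    if img is None:
        return 0.0
    if img.shape != target.shape:
        return 0.0  # resolution mismatch treated as failure in this sketch
    # Mean absolute pixel error mapped to a similarity score.
    return float(1.0 - np.abs(img - target).mean())
```

In an RL loop, this scalar would be the per-sample reward: non-executable code is penalized outright, and among executable candidates, renders closer to the target image score higher.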