OpusAnimation: Code-Based Dynamic Chart Generation

📅 2025-10-02
🤖 AI Summary
Addressing the underexplored problem of dynamic chart generation and comprehension, this work introduces the first systematic research framework for the task. Methodologically, we construct DCG-8K, a high-quality human-annotated dataset, and propose DCG-Bench, the first multi-task benchmark covering Simple Text-to-Chart, Detailed Text-to-Chart, and Video-to-Chart generation. We design a two-stage training recipe built on instruction-code-video triplet learning, integrating a joint code-visual reward with group relative policy optimization for cross-modal joint modeling. Empirically, our model, despite using only 3B parameters, outperforms the strongest open-source baseline by an average of 8.31% on DCG-Bench and matches the performance of leading closed-source commercial models. This work fills critical gaps in both evaluation infrastructure and training methodology for dynamic chart generation, advancing generation fidelity and spatiotemporal consistency.

📝 Abstract
Dynamic Chart Generation (DCG) involves producing code-rendered animated visualizations as charts. While recent advances in multi-modal large language models (MLLMs) have significantly improved their capability on static chart generation and comprehension, MLLMs' potential for handling dynamic chart generation and understanding remains underexplored. To bridge this research gap, we introduce DCG-Bench (Dynamic Chart Generation Benchmark), the first benchmark evaluating MLLMs' capability on dynamic chart generation tasks across three dimensions: Simple Text-to-Chart, Detailed Text-to-Chart, and Video-to-Chart tasks. We construct DCG-8K, a high-quality DCG dataset with annotations covering instruction-code-video triplets and QA pairs for both code and video evaluation. Based on DCG-8K, we explore a two-stage training recipe, proposing a Joint-Code-Visual Reward for group relative policy optimization to construct the expert MLLM Qwen2.5-VL-DCG-3B for the DCG task. Our benchmarking results reveal shortcomings of existing MLLMs on the visual-to-chart task; our model beats the best open-source MLLM with an average 8.31% performance gain across the three tasks and performs on par with proprietary models despite having only 3B parameters, proving the effectiveness of our training recipe. Our code and dataset will be publicly available.
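To make the task concrete: a "dynamic chart" here is an animation rendered from chart code, so a DCG model must emit a program whose rendered video matches the instruction. A minimal sketch of such a program, using generic matplotlib animation (purely illustrative, not the paper's pipeline or dataset code):

```python
# Illustrative dynamic chart: a sine curve revealed frame by frame.
# Generic matplotlib usage, not the paper's actual generated code.
import matplotlib
matplotlib.use("Agg")  # headless backend; no display required
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.animation import FuncAnimation

x = np.linspace(0, 2 * np.pi, 100)
fig, ax = plt.subplots()
(line,) = ax.plot([], [])
ax.set_xlim(0, 2 * np.pi)
ax.set_ylim(-1.1, 1.1)

def update(frame):
    # Each frame reveals one more point of the curve.
    line.set_data(x[: frame + 1], np.sin(x[: frame + 1]))
    return (line,)

anim = FuncAnimation(fig, update, frames=len(x), blit=True)
# anim.save("sine.mp4") would produce the video that a
# video-evaluation QA pair could then be scored against.
```

A model evaluated on DCG-Bench would be judged both on the code itself and on the video this kind of program renders.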
Problem

Research questions and friction points this paper is trying to address.

Dynamic chart generation and understanding using MLLMs
Creating a benchmark for dynamic chart generation tasks
Developing an expert model for improved chart generation performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introducing DCG-Bench benchmark for dynamic chart evaluation
Creating DCG-8K dataset with instruction-code-video triplets
Proposing Joint-Code-Visual Reward for policy optimization training
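The last point combines two known ingredients: a scalar reward mixing a code-quality signal with a rendered-video signal, and group relative policy optimization (GRPO), which normalizes each sampled completion's reward against its group rather than using a learned value critic. A minimal sketch of that advantage computation, where the weights and scores are illustrative assumptions, not the paper's exact formulation:

```python
# Sketch of group-relative advantages under a joint code-visual reward.
# Weights and example scores are hypothetical, for illustration only.
from statistics import mean, pstdev

def joint_reward(code_score: float, visual_score: float,
                 w_code: float = 0.5, w_visual: float = 0.5) -> float:
    # Combine a code-correctness score and a rendered-video quality
    # score into one scalar reward (assumed equal weighting).
    return w_code * code_score + w_visual * visual_score

def group_relative_advantages(rewards: list[float]) -> list[float]:
    # GRPO-style normalization: each reward is standardized against
    # the mean and std of its own sampled group.
    mu, sigma = mean(rewards), pstdev(rewards)
    if sigma == 0:
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]

# Example: four sampled chart programs for one instruction,
# each with a (code_score, visual_score) pair.
rewards = [joint_reward(c, v)
           for c, v in [(1.0, 0.8), (1.0, 0.4), (0.0, 0.0), (1.0, 0.6)]]
advantages = group_relative_advantages(rewards)
```

These advantages would then weight the policy-gradient update; the design choice GRPO makes is that only within-group reward differences matter, which suits rewards on different scales like code checks versus video quality.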
Authors

Bozheng Li (Opus AI Research, Brown University)
Miao Yang (Opus AI Research)
Zhenhan Chen (Opus AI Research)
Jiawang Cao (Opus AI Research)
Mushui Liu (Zhejiang University)
Yi Lu (Opus AI Research, University of Toronto)
Yongliang Wu (Southeast University)
Bin Zhang (Opus AI Research)
Yangguang Ji (Opus AI Research)
Licheng Tang (Opus AI Research)
Jay Wu (Opus AI Research)
Wenbo Zhu (Opus AI Research)