Aligned Multi-View Scripts for Universal Chart-to-Code Generation

📅 2026-04-27

🤖 AI Summary
Existing approaches to chart-to-code generation are predominantly confined to Python: they do not support other plotting languages and lack supervision signals for cross-language semantic alignment. To address this limitation, this work introduces Chart2NCode, a dataset of 176K charts, each paired with semantically equivalent scripts in Python, R, and LaTeX. Building on a LLaVA-style architecture, the authors propose CharLuMA, a parameter-efficient adaptation module that uses a language-conditioned mixture of low-rank subspaces to share chart understanding across languages while keeping language-specific code generation lightweight. Experiments show consistent gains in executability and visual fidelity across all three target languages; the method outperforms strong open-source baselines and remains competitive with proprietary systems. Further analysis indicates that balanced multi-language supervision benefits every language.
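
To make the language-conditioned low-rank mixing idea more concrete, here is a minimal, dependency-free sketch. It is not the authors' implementation: the class name `LanguageRoutedProjector`, the per-language softmax router, the random initialization, and all dimensions are assumptions chosen for illustration. It only shows how one shared projection can be combined with a language-weighted sum of low-rank updates.

```python
import math
import random

LANGUAGES = ("python", "r", "latex")


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small Gaussian matrix standing in for learned weights."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class LanguageRoutedProjector:
    """Shared base projection plus a language-weighted mix of low-rank updates.

    Hypothetical design for illustration; the paper's module may differ.
    """

    def __init__(self, d_in, d_out, rank, num_subspaces, seed=0):
        rng = random.Random(seed)
        # Shared projector: the same for every target language.
        self.base = random_matrix(d_out, d_in, rng)
        # Each subspace is a low-rank update: up (d_out x rank) @ down (rank x d_in).
        self.down = [random_matrix(rank, d_in, rng) for _ in range(num_subspaces)]
        self.up = [random_matrix(d_out, rank, rng) for _ in range(num_subspaces)]
        # Assumed router: one logit per subspace for each language.
        self.router = {
            lang: [rng.gauss(0.0, 1.0) for _ in range(num_subspaces)]
            for lang in LANGUAGES
        }

    def routing_weights(self, language):
        """Mixture weights over subspaces for the given target language."""
        return softmax(self.router[language])

    def project(self, visual_feature, language):
        """Project one visual feature vector, conditioned on the target language."""
        output = matvec(self.base, visual_feature)
        weights = self.routing_weights(language)
        for weight, down, up in zip(weights, self.down, self.up):
            update = matvec(up, matvec(down, visual_feature))
            output = [o + weight * u for o, u in zip(output, update)]
        return output


projector = LanguageRoutedProjector(d_in=8, d_out=4, rank=2, num_subspaces=3)
feature = [0.5] * 8
for lang in LANGUAGES:
    print(lang, [round(v, 3) for v in projector.project(feature, lang)])
```

In this sketch the base matrix plays the role of shared chart understanding, while the routing weights decide how much each low-rank subspace contributes for a given language. The extra parameters per subspace scale with `rank * (d_in + d_out)` rather than `d_in * d_out`, which is why such adapters are usually described as parameter-efficient.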

📝 Abstract
Chart-to-code generation converts a chart image into an executable plotting script, enabling faithful reproduction and editable visualizations. Existing methods are largely Python-centric, limiting practical use and overlooking a critical source of supervision: the same chart can be expressed by semantically equivalent scripts in different plotting languages. To fill this gap, we introduce Chart2NCode, a dataset of 176K charts paired with aligned scripts in Python, R, and LaTeX that render visually equivalent outputs, constructed via a metadata-to-template pipeline with rendering verification and human quality checks. Building on a LLaVA-style architecture, we further propose CharLuMA, a parameter-efficient adaptation module that augments the multimodal projector with a language-conditioned mixture of low-rank subspaces, allowing the model to share core chart understanding while specializing code generation to the target language through lightweight routing. Extensive experiments show consistent gains in executability and visual fidelity across all languages, outperforming strong open-source baselines and remaining competitive with proprietary systems. Further analyses reveal that balanced multi-language supervision benefits all languages and that the adapter allocates a compact shared core plus language-specific capacity. Code and data are available at https://github.com/Zhihan72/CharLuMA.
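
The metadata-to-template idea mentioned in the abstract can be illustrated with a small sketch. Everything below is hypothetical: the `BarChartMetadata` schema, the three template functions, and the choice of matplotlib, base R `barplot`, and pgfplots as rendering targets are assumptions, not the authors' pipeline. The sketch only generates aligned scripts from one language-neutral record; rendering verification and human quality checks are out of scope here.

```python
from dataclasses import dataclass


@dataclass
class BarChartMetadata:
    """Language-neutral description of a bar chart (hypothetical schema)."""
    title: str
    categories: list[str]
    values: list[float]


def to_python(meta):
    """Fill a matplotlib template from the metadata."""
    return "\n".join([
        "import matplotlib.pyplot as plt",
        f"plt.bar({meta.categories!r}, {meta.values!r})",
        f"plt.title({meta.title!r})",
        "plt.savefig('chart.png')",
    ])


def to_r(meta):
    """Fill a base-R barplot template from the metadata."""
    names = ", ".join(f'"{c}"' for c in meta.categories)
    values = ", ".join(str(v) for v in meta.values)
    return "\n".join([
        'png("chart.png")',
        f'barplot(c({values}), names.arg = c({names}), main = "{meta.title}")',
        "dev.off()",
    ])


def to_latex(meta):
    """Fill a pgfplots template from the metadata."""
    coords = " ".join(f"({c},{v})" for c, v in zip(meta.categories, meta.values))
    xcoords = ",".join(meta.categories)
    return "\n".join([
        r"\documentclass{standalone}",
        r"\usepackage{pgfplots}",
        r"\begin{document}",
        r"\begin{tikzpicture}",
        r"\begin{axis}[ybar, symbolic x coords={" + xcoords
        + r"}, xtick=data, title={" + meta.title + "}]",
        r"\addplot coordinates {" + coords + "};",
        r"\end{axis}",
        r"\end{tikzpicture}",
        r"\end{document}",
    ])


def render_all(meta):
    """Produce one aligned script per target language from the same record."""
    return {"python": to_python(meta), "r": to_r(meta), "latex": to_latex(meta)}


meta = BarChartMetadata(title="Sales", categories=["Q1", "Q2"], values=[3.0, 5.0])
scripts = render_all(meta)
for language, script in scripts.items():
    print(f"--- {language} ---")
    print(script)
```

Because all three scripts are derived from the same record, they agree on data, labels, and title by construction, which is the property that makes them usable as aligned supervision. This sketch does not escape quotes or special characters in titles and category names; a real pipeline would need to, and would also need to execute each script to confirm that the renders match.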

Problem

Research questions and friction points this paper is trying to address.

chart-to-code generation
multi-language supervision
visually equivalent scripts
plotting languages
code generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

chart-to-code generation
multi-view aligned scripts
language-conditioned low-rank adaptation
multimodal code generation
universal visualization