ChartAnchor: Chart Grounding with Structural-Semantic Fidelity

📅 2025-11-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing chart understanding benchmarks suffer from limited chart type diversity, fragmented tasks, and incomplete evaluation protocols, hindering rigorous assessment of structured chart grounding. To address this, we introduce ChartAnchor—the first large-scale benchmark for bidirectional visual-data alignment—comprising 30 real-world chart types and over 8,000 chart-table-code triplets, supporting both chart-to-code generation and controlled table reconstruction. We propose a novel multi-level verification framework integrating code executability, header-level structural constraints, and semantic fidelity, enabling the first unified evaluation of syntactic structure and semantic correctness. Experiments reveal significant deficiencies in mainstream multimodal large language models (MLLMs) regarding numerical accuracy and structured code generation, underscoring the necessity of structured reasoning. ChartAnchor establishes a new methodological foundation and evaluation standard for trustworthy chart understanding in scientific, financial, and other data-intensive domains.

📝 Abstract

Recent advances in multimodal large language models (MLLMs) highlight the need for benchmarks that rigorously evaluate structured chart comprehension. Chart grounding refers to the bidirectional alignment between a chart's visual appearance and its structured semantics. The task requires a model to produce a symbolic specification that faithfully captures the chart's visual and structural intent, and also to recover the underlying tabular data with precise values and relationships. Chart grounding therefore directly reflects a model's capabilities in numerical reasoning, multimodal alignment, and structural reconstruction, and has several important applications in real-world scenarios. Existing benchmarks, constrained by narrow chart diversity, isolated tasks, and incomplete evaluation frameworks, fail to assess grounding holistically. To address this, we propose ChartAnchor, a comprehensive benchmark of 8k+ chart-table-code triples spanning 30 chart types drawn from diverse real-world and augmented sources. ChartAnchor introduces two complementary tasks: chart-to-code generation (synthesizing executable code that replicates a chart) and controlled chart-to-table reconstruction (extracting exact data under predefined headers), which together enable cross-validation of visual and numerical fidelity. A multi-level evaluation framework integrates semantic validation, stylistic analysis, and perceptual metrics to assess both structural and content-level correctness. Extensive experiments on MLLMs reveal critical limitations in numerical precision and code synthesis, emphasizing the need for structured reasoning beyond surface-level perception. By unifying symbolic and data-driven grounding, ChartAnchor establishes a rigorous foundation for chart grounding and offers meaningful insights for advancing MLLMs in scientific, financial, and industrial domains.
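To make the controlled chart-to-table reconstruction task more concrete, the sketch below scores a predicted table against a reference table with predefined headers. It is an illustration only, not the metric used by ChartAnchor: the function names, the dictionary layout (`headers`, `rows`), the all-or-nothing header gate, and the 5% relative tolerance are all assumptions made for this example.

```python
from math import isclose


def header_match(pred_headers, ref_headers):
    """True when the predicted headers equal the predefined headers, in order."""
    return list(pred_headers) == list(ref_headers)


def cell_accuracy(pred_rows, ref_rows, rel_tol=0.05):
    """Fraction of reference cells matched by the prediction at the same position.

    Numeric cells match within a relative tolerance (hypothetical default: 5%).
    Other cells match on case-insensitive, whitespace-trimmed string equality.
    """
    total = 0
    correct = 0
    for i, ref_row in enumerate(ref_rows):
        pred_row = pred_rows[i] if i < len(pred_rows) else []
        for j, ref_val in enumerate(ref_row):
            total += 1
            if j >= len(pred_row):
                continue  # missing cell counts as an error
            pred_val = pred_row[j]
            both_numeric = isinstance(ref_val, (int, float)) and isinstance(
                pred_val, (int, float)
            )
            if both_numeric:
                if isclose(pred_val, ref_val, rel_tol=rel_tol):
                    correct += 1
            elif str(pred_val).strip().lower() == str(ref_val).strip().lower():
                correct += 1
    return correct / total if total else 0.0


def score_table(pred, ref, rel_tol=0.05):
    """Assumed scoring rule: zero if headers differ, else positional cell accuracy."""
    if not header_match(pred["headers"], ref["headers"]):
        return 0.0
    return cell_accuracy(pred["rows"], ref["rows"], rel_tol)


reference = {"headers": ["Year", "Revenue"], "rows": [[2020, 100.0], [2021, 200.0]]}
prediction = {"headers": ["Year", "Revenue"], "rows": [[2020, 102.0], [2021, 250.0]]}
score = score_table(prediction, reference)
```

In this toy case three of the four cells fall within tolerance, so `score` is 0.75. Fixing the headers in advance is what makes a positional comparison like this possible; without that constraint, a scorer would first have to align columns between the two tables.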

Problem

Research questions and friction points this paper is trying to address.

Evaluates multimodal models' chart comprehension via visual-structural alignment.

Benchmarks chart grounding with diverse chart types and real-world data.

Assesses numerical reasoning and code synthesis for chart reconstruction.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces the ChartAnchor benchmark with 8k+ chart-table-code triples.

Proposes chart-to-code generation and controlled chart-to-table reconstruction tasks.

Uses a multi-level evaluation framework for structural and content correctness.
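For the chart-to-code task, the most basic check in any multi-level evaluation is whether the generated code runs at all. The sketch below shows one way such an executability gate could look. It is an assumption for illustration, not ChartAnchor's harness: the function name `is_executable`, the 30-second timeout, and the choice to run the script in a separate Python process are all hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def is_executable(code, timeout_s=30.0):
    """Run generated code in a fresh Python process; True if it exits cleanly.

    Hypothetical gate: a non-zero exit code or a timeout counts as failure.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = os.path.join(workdir, "chart.py")
        with open(script, "w", encoding="utf-8") as f:
            f.write(code)
        # Request a non-interactive backend in case the code uses matplotlib;
        # the variable is simply ignored by code that does not import it.
        env = dict(os.environ, MPLBACKEND="Agg")
        try:
            result = subprocess.run(
                [sys.executable, script],
                cwd=workdir,
                env=env,
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return False
        return result.returncode == 0


ok = is_executable("values = [1, 2, 3]\ntotal = sum(values)\n")
broken = is_executable("raise ValueError('bad chart spec')\n")
```

A separate process keeps a crash or hang in model-generated code from taking down the evaluator, but it is not a sandbox; untrusted code should still be run inside an isolated environment. Passing this gate says nothing about whether the rendered chart matches the original, which is what the structural and content-level checks are for.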
