Chain of Functions: A Programmatic Pipeline for Fine-Grained Chart Reasoning Data

πŸ“… 2025-03-20
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
Current multimodal large language models (MLLMs) suffer from limited logical reasoning capabilities on complex charts due to the scarcity of high-quality, fine-grained reasoning data; prompt-based generation methods struggle to simultaneously ensure accuracy and diversity. To address this, we propose a function-chain-driven data synthesis paradigm: atomic function chains (e.g., extremum extraction, arithmetic operations) are programmatically enumerated to automatically construct diverse, precise, and interpretable fine-grained reasoning paths, which are then converted into natural-language question-answer pairs using a lightweight open-source LLM. This approach avoids reliance on massive models, mitigates hallucination, and natively supports attribution analysis. Based on it, we introduce ChartCoFβ€”a dataset comprising 1.4k fine-grained reasoning chains and 50k augmented QA pairs. ChartCoF significantly improves MLLM performance on mainstream chart reasoning benchmarks at comparable scale and, for the first time, systematically reveals model capability disparities across distinct reasoning types.
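The enumeration step described above can be sketched in a few lines. This is an illustrative toy, not the paper's released code: the atomic functions (`f_max`, `f_min`, `f_mean`), the arithmetic operations, and the two-step chain shape are all hypothetical stand-ins for the richer function library the paper uses.

```python
# Toy sketch of CoF-style function-chain enumeration (all names are
# illustrative, not taken from the paper's implementation).
import itertools

# Toy chart data: category -> value
chart = {"Q1": 120, "Q2": 95, "Q3": 140, "Q4": 110}

# Atomic extraction functions: each returns (value, rationale step)
def f_max(data):
    k = max(data, key=data.get)
    return data[k], f"the maximum value is {data[k]} (at {k})"

def f_min(data):
    k = min(data, key=data.get)
    return data[k], f"the minimum value is {data[k]} (at {k})"

def f_mean(data):
    v = sum(data.values()) / len(data)
    return v, f"the mean value is {v:.2f}"

# Atomic arithmetic operations combining two intermediate results
def op_diff(a, b):
    return a - b, f"their difference is {a - b}"

def op_ratio(a, b):
    return a / b, f"their ratio is {a / b:.2f}"

def enumerate_chains(data):
    """Enumerate two-step chains: two distinct extractors, then one op."""
    extractors = [f_max, f_min, f_mean]
    ops = [op_diff, op_ratio]
    for (fa, fb), op in itertools.product(
            itertools.permutations(extractors, 2), ops):
        va, ra = fa(data)
        vb, rb = fb(data)
        answer, rop = op(va, vb)
        # The executed chain yields both a precise answer and a
        # built-in, step-by-step rationale.
        rationale = f"First, {ra}. Next, {rb}. Finally, {rop}."
        yield (fa.__name__, fb.__name__, op.__name__), answer, rationale

chains = list(enumerate_chains(chart))
```

Because every answer is computed by executing the chain rather than generated by a model, the labels are exact by construction, which is the precision guarantee the summary refers to.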

πŸ“ Abstract
Visual reasoning is crucial for multimodal large language models (MLLMs) to address complex chart queries, yet high-quality rationale data remains scarce. Existing methods leveraged (M)LLMs for data generation, but direct prompting often yields limited precision and diversity. In this paper, we propose Chain of Functions (CoF), a novel programmatic reasoning data generation pipeline that utilizes freely-explored reasoning paths as supervision to ensure data precision and diversity. Specifically, it starts with human-free exploration among the atomic functions (e.g., maximum data and arithmetic operations) to generate diverse function chains, which are then translated into linguistic rationales and questions with only a moderate open-sourced LLM. CoF provides multiple benefits: 1) Precision: function-governed generation reduces hallucinations compared to freeform generation; 2) Diversity: enumerating function chains enables varied question taxonomies; 3) Explainability: function chains serve as built-in rationales, allowing fine-grained evaluation beyond overall accuracy; 4) Practicality: eliminating reliance on extremely large models. Employing CoF, we construct the ChartCoF dataset, with 1.4k complex reasoning Q&A for fine-grained analysis and 50k Q&A for reasoning enhancement. The fine-grained evaluation on ChartCoF reveals varying performance across question taxonomies for each MLLM, and the experiments also show that finetuning with ChartCoF achieves state-of-the-art performance among same-scale MLLMs on widely used benchmarks. Furthermore, the novel paradigm of function-governed rationale generation in CoF could inspire broader applications beyond charts.
Problem

Research questions and friction points this paper is trying to address.

High-quality, fine-grained rationale data for complex chart reasoning is scarce.
Direct prompting of (M)LLMs for data generation yields limited precision and diversity.
Overall-accuracy benchmarks cannot reveal how MLLMs perform on distinct reasoning types.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Programmatic pipeline (Chain of Functions) for generating chart reasoning data
Enumerates atomic function chains to guarantee precision and diversity
Translates function chains into linguistic rationales and questions with a moderate open-source LLM
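The last step above, phrasing an executed chain as a natural-language QA pair, only needs a small LLM because the reasoning is already done. A minimal sketch of building such a rewriting prompt, with the step strings and `build_prompt` helper being hypothetical illustrations rather than the paper's actual templates:

```python
# Illustrative sketch (not the paper's code): turn an executed function
# chain into a rewriting prompt for a moderate open-source LLM.
def build_prompt(chain_steps, answer):
    """Format rationale steps and the computed answer into an instruction."""
    steps = "\n".join(f"{i + 1}. {s}" for i, s in enumerate(chain_steps))
    return (
        "Rewrite the following chart-reasoning steps as one natural-language "
        "question and a step-by-step answer. Keep all numbers exact.\n"
        f"Steps:\n{steps}\n"
        f"Final answer: {answer}\n"
    )

prompt = build_prompt(
    ["the maximum value is 140 (at Q3)",
     "the minimum value is 95 (at Q2)",
     "their difference is 45"],
    45,
)
```

Since the model only paraphrases steps whose numbers are already verified, hallucinated values can be detected by checking the output against the chain, which is how the pipeline natively supports attribution analysis.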
πŸ‘₯ Authors
Zijian Li (Hong Kong University of Science and Technology)
Jingjing Fu (MS, image/video processing)
Lei Song (Microsoft Research)
Jiang Bian (Microsoft Research)
Jun Zhang (Hong Kong University of Science and Technology)
Rui Wang (Microsoft Research)