DomainCQA: Crafting Expert-Level QA from Domain-Specific Charts

📅 2025-03-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing general-purpose Chart Question Answering (CQA) benchmarks inadequately evaluate the capacity of multimodal large language models (MLLMs) for deep reasoning that integrates visual information with domain-specific knowledge. To address this, we propose DomainCQA, a scalable, systematic methodology for constructing domain-specialized CQA benchmarks, and instantiate it in astronomy as AstroChart, which incorporates expert annotation, chart semantic parsing, and cross-modal alignment. Experiments identify the core bottlenecks of MLLMs as chart-hopping reasoning, joint analysis of multiple charts, and domain-knowledge-guided summarization, rather than mere factual recall. AstroChart establishes the first rigorous, reproducible evaluation standard for domain-specialized MLLMs, advancing multimodal model assessment toward professional application scenarios.

📝 Abstract

Chart Question Answering (CQA) benchmarks are essential for evaluating the capability of Multimodal Large Language Models (MLLMs) to interpret visual data. However, current benchmarks focus primarily on general-purpose CQA and fail to adequately capture domain-specific challenges. We introduce DomainCQA, a systematic methodology for constructing domain-specific CQA benchmarks, and demonstrate its effectiveness by developing AstroChart, a CQA benchmark in the field of astronomy. Our evaluation shows that the primary challenges for existing MLLMs are chart reasoning and combining chart information with domain knowledge for deeper analysis and summarization, rather than domain-specific knowledge itself, highlighting a critical gap in current benchmarks. By providing a scalable and rigorous framework, DomainCQA enables more precise assessment and improvement of MLLMs for domain-specific applications.

Problem

Research questions and friction points this paper is trying to address.

Evaluating MLLMs' ability to interpret domain-specific charts

Addressing gaps in current general-purpose CQA benchmarks

Challenges in chart reasoning and domain knowledge integration

Innovation

Methods, ideas, or system contributions that make the work stand out.

Domain-specific CQA benchmark construction methodology

AstroChart for astronomy QA evaluation

Scalable framework for MLLM assessment
