Anagent For Enhancing Scientific Table & Figure Analysis

📅 2026-02-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Tables and figures in scientific papers are structurally complex, multimodal, and highly context-dependent, which makes them hard for current AI systems to interpret accurately. This work proposes Anagent, a multi-agent framework in which four collaborating agents (Planner, Expert, Solver, and Critic) handle task decomposition, tool-based information retrieval, analysis generation, and critique. Key contributions include a multi-agent collaboration mechanism tailored to scientific table and figure analysis; AnaBench, a benchmark of 63,178 samples categorized along seven complexity dimensions; an iterative refinement loop driven by a five-dimensional quality assessment; and modular training strategies that combine supervised fine-tuning with specialized reinforcement learning. Evaluated across 9 domains and 170 subdomains, the approach improves performance by up to 13.43% in training-free settings and 42.12% after fine-tuning, underscoring the importance of task-oriented reasoning and context awareness.

📝 Abstract

In scientific research, analysis requires accurately interpreting complex multimodal knowledge, integrating evidence from different sources, and drawing inferences grounded in domain-specific knowledge. However, current artificial intelligence (AI) systems struggle to consistently demonstrate such capabilities. The complexity and variability of scientific tables and figures, combined with heterogeneous structures and long-context requirements, pose fundamental obstacles to scientific table & figure analysis. To quantify these challenges, we introduce AnaBench, a large-scale benchmark featuring 63,178 instances from nine scientific domains, systematically categorized along seven complexity dimensions. To tackle these challenges, we propose Anagent, a multi-agent framework for enhanced scientific table & figure analysis through four specialized agents: Planner decomposes tasks into actionable subtasks, Expert retrieves task-specific information through targeted tool execution, Solver synthesizes information to generate coherent analysis, and Critic performs iterative refinement through five-dimensional quality assessment. We further develop modular training strategies that leverage supervised finetuning and specialized reinforcement learning to optimize individual capabilities while maintaining effective collaboration. Comprehensive evaluation across 9 broad domains with 170 subdomains demonstrates that Anagent achieves substantial improvements, up to 13.43% in training-free settings and 42.12% with finetuning, while revealing that task-oriented reasoning and context-aware problem-solving are essential for high-quality scientific table & figure analysis. Our project page: https://xhguo7.github.io/Anagent/.
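The four-agent flow described in the abstract can be pictured with a small control-loop sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function name `run_pipeline`, the `Result` type, the toy agents, the placeholder dimension names, the score threshold, and the round limit are all hypothetical. The abstract states only that the Critic uses a five-dimensional quality assessment and refines iteratively; it does not name the dimensions or the stopping rule.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Placeholder names: the abstract says the Critic scores five dimensions
# but does not name them, so these labels are purely illustrative.
DIMENSIONS = ["dim_1", "dim_2", "dim_3", "dim_4", "dim_5"]


@dataclass
class Result:
    analysis: str
    scores: Dict[str, float]
    rounds: int


def run_pipeline(
    task: str,
    planner: Callable[[str], List[str]],
    expert: Callable[[str], str],
    solver: Callable[[str, List[str], str], str],
    critic: Callable[[str, str], Tuple[Dict[str, float], str]],
    threshold: float = 0.8,  # assumed acceptance score, not from the paper
    max_rounds: int = 3,     # assumed refinement budget, not from the paper
) -> Result:
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    subtasks = planner(task)                  # Planner: decompose the task
    evidence = [expert(s) for s in subtasks]  # Expert: gather information per subtask
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        analysis = solver(task, evidence, feedback)  # Solver: draft or revise
        scores, feedback = critic(task, analysis)    # Critic: score and comment
        if min(scores.values()) >= threshold:        # stop once every dimension passes
            break
    return Result(analysis, scores, round_no)


# Toy stand-ins for the four agents so the loop can be run end to end.
def toy_planner(task: str) -> List[str]:
    return [f"read caption: {task}", f"extract values: {task}"]


def toy_expert(subtask: str) -> str:
    return f"evidence<{subtask}>"


def toy_solver(task: str, evidence: List[str], feedback: str) -> str:
    suffix = " (revised)" if feedback else ""
    return f"analysis of {task} using {len(evidence)} evidence items{suffix}"


def toy_critic(task: str, analysis: str) -> Tuple[Dict[str, float], str]:
    accepted = analysis.endswith("(revised)")
    score = 0.9 if accepted else 0.5
    comment = "" if accepted else "add detail"
    return {name: score for name in DIMENSIONS}, comment


if __name__ == "__main__":
    result = run_pipeline("Table 2", toy_planner, toy_expert, toy_solver, toy_critic)
    print(result.rounds, result.analysis)
```

With the toy agents, the first draft is rejected and the second, revised draft is accepted, so the loop stops after two rounds. In a real system each callable would wrap a model or tool call; passing the agents in as plain functions keeps the control flow separate from how any single agent is trained or prompted.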

Problem

Research questions and friction points this paper is trying to address.

scientific table analysis

figure interpretation

multimodal reasoning

complexity in scientific data

AI for scientific understanding

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent framework

scientific table and figure analysis

modular training