FinMR: A Knowledge-Intensive Multimodal Benchmark for Advanced Financial Reasoning

📅 2025-10-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Existing multimodal large language models (MLLMs) lack rigorous, domain-specific evaluation benchmarks for advanced financial reasoning, which hinders systematic assessment of their higher-order reasoning capabilities. Method: We introduce FinMR, the first knowledge-intensive, multimodal benchmark explicitly designed for advanced financial reasoning, covering 15 specialized subdomains with over 3,200 expert-annotated multimodal question-answer pairs. It integrates complex formula computation, financial statement interpretation, chart semantic analysis, and cross-modal deep reasoning. We propose a three-dimensional evaluation framework ("visual parsing, formula modeling, contextual reasoning") that enables the first joint quantification of visual grounding accuracy, domain-specific formula application fidelity, and long-range semantic coherence. Contribution/Results: Comprehensive evaluation of leading open- and closed-source MLLMs reveals critical deficiencies in precise image localization, dynamic formula derivation, and domain-logical consistency. FinMR establishes a reproducible, fine-grained benchmark and provides concrete, actionable directions for advancing professional multimodal AI.

📝 Abstract

Multimodal Large Language Models (MLLMs) have made substantial progress in recent years. However, their rigorous evaluation within specialized domains like finance is hindered by the absence of datasets characterized by professional-level knowledge intensity, detailed annotations, and advanced reasoning complexity. To address this critical gap, we introduce FinMR, a high-quality, knowledge-intensive multimodal dataset explicitly designed to evaluate expert-level financial reasoning capabilities at a professional analyst's standard. FinMR comprises over 3,200 meticulously curated and expertly annotated question-answer pairs across 15 diverse financial topics, ensuring broad domain diversity and integrating sophisticated mathematical reasoning, advanced financial knowledge, and nuanced visual interpretation tasks across multiple image types. Through comprehensive benchmarking with leading closed-source and open-source MLLMs, we highlight significant performance disparities between these models and professional financial analysts, uncovering key areas for model advancement, such as precise image analysis, accurate application of complex financial formulas, and deeper contextual financial understanding. By providing richly varied visual content and thorough explanatory annotations, FinMR establishes itself as an essential benchmark tool for assessing and advancing multimodal financial reasoning toward professional analyst-level competence.

Problem

Research questions and friction points this paper is trying to address.

Evaluating expert-level financial reasoning in multimodal models

Addressing absence of knowledge-intensive financial datasets

Assessing professional analyst-level multimodal reasoning capabilities

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduced knowledge-intensive multimodal financial reasoning benchmark

Curated expert-annotated dataset with mathematical and visual tasks

Established comprehensive evaluation framework for financial analysis capabilities

🔎 Similar Papers

FAMMA: A Benchmark for Financial Domain Multilingual Multimodal Question Answering