FinMMR: Make Financial Numerical Reasoning More Multimodal, Comprehensive, and Challenging

📅 2025-08-06
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing financial numerical reasoning benchmarks lack multimodality, comprehensiveness, and sufficient challenge, and fail to assess MLLMs' multi-step precise reasoning over complex financial visuals (e.g., ownership structure diagrams, bar charts, tables) jointly with textual context. Method: We introduce FinMMR—the first bilingual (Chinese–English) multimodal financial numerical reasoning benchmark—covering 14 financial subdomains, constructed via human annotation and multi-source synthesis, and comprising 4.3K questions and 8.7K images. Contribution/Results: FinMMR is the first large-scale multimodal reasoning dataset systematically built from authentic Chinese financial research reports; it emphasizes finance-knowledge-driven multi-step numerical reasoning and substantially raises evaluation difficulty. Experiments show that the best-performing MLLMs achieve only 53.0% accuracy on its Hard subset, exposing a critical bottleneck in domain-specific multimodal reasoning.

📝 Abstract
We present FinMMR, a novel bilingual multimodal benchmark tailored to evaluate the reasoning capabilities of multimodal large language models (MLLMs) in financial numerical reasoning tasks. Compared to existing benchmarks, our work introduces three significant advancements. (1) Multimodality: We meticulously transform existing financial reasoning benchmarks, and construct novel questions from the latest Chinese financial research reports. FinMMR comprises 4.3K questions and 8.7K images spanning 14 categories, including tables, bar charts, and ownership structure charts. (2) Comprehensiveness: FinMMR encompasses 14 financial subdomains, including corporate finance, banking, and industry analysis, significantly exceeding existing benchmarks in financial domain knowledge breadth. (3) Challenge: Models are required to perform multi-step precise numerical reasoning by integrating financial knowledge with the understanding of complex financial images and text. The best-performing MLLM achieves only 53.0% accuracy on Hard problems. We believe that FinMMR will drive advancements in enhancing the reasoning capabilities of MLLMs in real-world scenarios.
Problem

Research questions and friction points this paper is trying to address.

Evaluate MLLMs' reasoning in financial numerical tasks
Expand financial domain coverage with 14 subdomains
Enhance multimodal reasoning with complex financial images
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal benchmark with 4.3K questions and 8.7K images
Covers 14 financial subdomains for comprehensive evaluation
Requires multi-step numerical reasoning with financial knowledge
👥 Authors
Zichen Tang — Beijing University of Posts and Telecommunications
Haihong E — Beijing University of Posts and Telecommunications
Jiacheng Liu — Beijing University of Posts and Telecommunications
Zhongjun Yang — Beijing University of Posts and Telecommunications
Rongjin Li — Xiamen University, VoiceAI
Zihua Rong — Beijing University of Posts and Telecommunications
Haoyang He — Beijing University of Posts and Telecommunications
Zhuodi Hao — Beijing University of Posts and Telecommunications
Xinyang Hu — Beijing University of Posts and Telecommunications
Kun Ji — Beijing University of Posts and Telecommunications
Ziyan Ma — Beijing University of Posts and Telecommunications
Mengyuan Ji — Beijing University of Posts and Telecommunications
Jun Zhang — Beijing University of Posts and Telecommunications
Chenghao Ma — Beijing University of Posts and Telecommunications
Qianhe Zheng — Beijing University of Posts and Telecommunications
Yang Liu — Beijing University of Posts and Telecommunications
Yiling Huang — University of Michigan
Xinyi Hu — Beijing University of Posts and Telecommunications
Qing Huang — Chinese Academy of Sciences
Zijian Xie — University of Michigan
Shiyao Peng — Beijing University of Posts and Telecommunications