Uni-FinLLM: A Unified Multimodal Large Language Model with Modular Task Heads for Micro-Level Stock Prediction and Macro-Level Systemic Risk Assessment

📅 2026-01-06
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes the first unified multimodal large language model for financial risk modeling that jointly captures micro-, meso-, and macro-scale dependencies, addressing the limitation of existing approaches that treat these scales in isolation. The model integrates financial text, time-series data, fundamental indicators, and visual information through a shared Transformer backbone, modular task-specific heads, and cross-modal attention, enabling end-to-end joint prediction across risk levels. Reported results show significant improvements over baselines: 67.4% accuracy in stock direction prediction (+5.7 percentage points, up from 61.7%), 84.1% in credit risk assessment (up from 79.6%), and 82.3% in macroeconomic risk early warning.
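The stock-direction gain is an absolute difference in accuracy (percentage points), not a relative improvement. A quick check of the arithmetic, using the baseline figures stated in the abstract (61.7% and 79.6%); the helper name `gains` is purely illustrative:

```python
# Accuracies in percent, as reported in the abstract: (baseline, Uni-FinLLM).
reported = {
    "stock direction": (61.7, 67.4),
    "credit risk": (79.6, 84.1),
}

def gains(baseline, model):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = model - baseline
    relative = 100.0 * absolute / baseline
    return round(absolute, 1), round(relative, 1)

for task, (baseline, model) in reported.items():
    abs_pp, rel_pct = gains(baseline, model)
    print(f"{task}: +{abs_pp} pp absolute, +{rel_pct}% relative")
# stock direction: +5.7 pp absolute, +9.2% relative
# credit risk: +4.5 pp absolute, +5.7% relative
```

No baseline is stated for the 82.3% macro early-warning accuracy, so no gain can be computed for that task from the figures given here.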

📝 Abstract
Financial institutions and regulators require systems that integrate heterogeneous data to assess risks from stock fluctuations to systemic vulnerabilities. Existing approaches often treat these tasks in isolation, failing to capture cross-scale dependencies. We propose Uni-FinLLM, a unified multimodal large language model that uses a shared Transformer backbone and modular task heads to jointly process financial text, numerical time series, fundamentals, and visual data. Through cross-modal attention and multi-task optimization, it learns a coherent representation for micro-, meso-, and macro-level predictions. Evaluated on stock forecasting, credit-risk assessment, and systemic-risk detection, Uni-FinLLM significantly outperforms baselines. It raises stock directional accuracy to 67.4% (from 61.7%), credit-risk accuracy to 84.1% (from 79.6%), and macro early-warning accuracy to 82.3%. Results validate that a unified multimodal LLM can jointly model asset behavior and systemic vulnerabilities, offering a scalable decision-support engine for finance.
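To make the architectural idea concrete, here is a minimal, dependency-free sketch of cross-modal attention feeding a shared representation into modular task heads, trained with a weighted multi-task loss. It is not the authors' implementation: the Transformer backbone is reduced to a single attention step plus mean pooling, and every name, dimension, weight, and label below is a hypothetical stand-in with random values.

```python
import math
import random

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_modal_attention(queries, keys, values):
    """Scaled dot-product attention: tokens of one modality (queries)
    attend over tokens of another modality (keys/values)."""
    scale = math.sqrt(len(keys[0]))
    fused = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        fused.append([sum(w * v[c] for w, v in zip(weights, values))
                      for c in range(len(values[0]))])
    return fused

def mean_pool(tokens):
    """Average token vectors into one shared representation."""
    n = len(tokens)
    return [sum(t[c] for t in tokens) / n for c in range(len(tokens[0]))]

def linear_head(features, weight_rows):
    """One task head: a linear map followed by softmax over classes."""
    return softmax([dot(features, row) for row in weight_rows])

def multi_task_loss(probs_by_task, labels, task_weights):
    """Weighted sum of per-task cross-entropy losses."""
    return sum(task_weights[t] * -math.log(probs_by_task[t][labels[t]])
               for t in probs_by_task)

random.seed(0)
DIM = 4  # hypothetical embedding width

def rand_tokens(n):
    return [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(n)]

text_tokens = rand_tokens(3)   # stand-in for embedded financial text
price_tokens = rand_tokens(5)  # stand-in for embedded time-series patches

# Text tokens attend over price tokens, then pool to a shared vector.
fused = cross_modal_attention(text_tokens, price_tokens, price_tokens)
shared = mean_pool(fused)

# Three hypothetical binary heads for the micro, meso, and macro tasks.
heads = {
    "stock_direction": rand_tokens(2),
    "credit_risk": rand_tokens(2),
    "macro_warning": rand_tokens(2),
}
probs = {task: linear_head(shared, w) for task, w in heads.items()}
labels = {"stock_direction": 1, "credit_risk": 0, "macro_warning": 1}
task_weights = {task: 1.0 for task in heads}

loss = multi_task_loss(probs, labels, task_weights)
print(probs)
print(loss)
```

The point of the design is that all three heads read the same `shared` vector, so a gradient from any one task would update the representation used by the others; a real system would replace the toy pieces with learned encoders and a full Transformer.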
Problem

Research questions and friction points this paper is trying to address.

stock prediction
systemic risk assessment
multimodal data
cross-scale dependencies
financial risk modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal large language model
modular task heads
cross-modal attention
multi-task optimization
systemic risk assessment
Authors
Gongao Zhang
China University of Geosciences, Wuhan, China
Haijiang Zeng
Walmart Inc., Bentonville, AR, USA
Lu Jiang
Research Scientist @ Apple
Generative AI, Foundation Model, Robust Deep Learning, Multimedia, Video Generation