Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

📅 2024-08-20
🏛️ arXiv.org
📈 Citations: 8
Influential: 0
📄 PDF
🤖 AI Summary
Existing financial large language models (LLMs) suffer from data scarcity, weak multimodal capabilities, and narrow evaluation protocols. To address these limitations, we introduce Open-FinLLMs, the first open-source multimodal LLM family for finance, comprising FinLLaMA, FinLLaMA-Instruct, and FinLLaVA. The models jointly handle text, tabular data, time-series signals, and charts, and support zero-shot, few-shot, and fine-tuning paradigms. Built upon the LLaMA architecture, FinLLaMA is pretrained on 52B tokens of financial text; FinLLaMA-Instruct is tuned on 573K financial instructions; and FinLLaVA is aligned with 1.43M multimodal samples. Evaluated across 14 financial tasks, 30 datasets, and 4 multimodal tasks, including two newly introduced multimodal evaluation datasets, the models outperform GPT-4 and other strong baselines on financial NLP, decision-making, and cross-modal analysis. All code and models are released under OSI-approved licenses to foster academic and industrial advancement.

📝 Abstract
Financial LLMs hold promise for advancing financial tasks and domain-specific applications. However, they are limited by scarce corpora, weak multimodal capabilities, and narrow evaluations, making them less suited for real-world application. To address this, we introduce Open-FinLLMs, the first open-source multimodal financial LLMs designed to handle diverse tasks across text, tabular, time-series, and chart data, excelling in zero-shot, few-shot, and fine-tuning settings. The suite includes FinLLaMA, pre-trained on a comprehensive 52-billion-token corpus; FinLLaMA-Instruct, fine-tuned with 573K financial instructions; and FinLLaVA, enhanced with 1.43M multimodal tuning pairs for strong cross-modal reasoning. We comprehensively evaluate Open-FinLLMs across 14 financial tasks, 30 datasets, and 4 multimodal tasks in zero-shot, few-shot, and supervised fine-tuning settings, introducing two new multimodal evaluation datasets. Our results show that Open-FinLLMs outperform advanced financial and general LLMs such as GPT-4 across financial NLP, decision-making, and multimodal tasks, highlighting their potential to tackle real-world challenges. To foster innovation and collaboration across academia and industry, we release all code (https://anonymous.4open.science/r/PIXIU2-0D70/B1D7/LICENSE) and models under OSI-approved licenses.
Problem

Research questions and friction points this paper is trying to address.

Addressing limited financial corpora and weak multimodal capabilities in Financial LLMs
Introducing open-source multimodal Financial LLMs for diverse financial tasks
Enhancing performance in financial NLP, decision-making, and multimodal tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Open-source multimodal financial LLMs
Pre-trained on 52B-token corpus
Enhanced with 1.43M multimodal pairs
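The zero- and few-shot evaluation settings highlighted above reduce to prompt construction: a handful of labeled financial examples are prepended to the query before it is sent to the model. A minimal sketch of that idea follows; the helper name, sample headlines, and labels are illustrative assumptions, not taken from the paper's released code.

```python
def build_few_shot_prompt(task_instruction, examples, query):
    """Assemble a k-shot prompt: instruction, k (input, label) demos, then the query."""
    parts = [task_instruction.strip()]
    for text, label in examples:
        parts.append(f"Input: {text}\nAnswer: {label}")
    # The query is left unanswered; the model completes the final "Answer:".
    parts.append(f"Input: {query}\nAnswer:")
    return "\n\n".join(parts)

# Example: 2-shot financial sentiment classification (hypothetical data).
demos = [
    ("Shares surged after the earnings beat.", "positive"),
    ("The firm warned of weaker guidance.", "negative"),
]
prompt = build_few_shot_prompt(
    "Classify the sentiment of each financial headline as positive, negative, or neutral.",
    demos,
    "Revenue was flat year over year.",
)
print(prompt)
```

Setting `examples` to an empty list yields the zero-shot variant of the same prompt.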
👥 Authors
Qianqian Xie
Wuhan University
NLP, LLM

Dong Li
The Fin AI

Mengxi Xiao
Wuhan University
Psychology, Large Language Model

Zihao Jiang
Wuhan University
Knowledge Graph, Natural Language Processing, Large Language Model

Ruoyu Xiang
The Fin AI

Xiao Zhang
The Fin AI

Zhengyu Chen
The Fin AI

Yueru He
Columbia University
Finance, Large Language Models

Weiguang Han
Columbia University

Yuzhe Yang
The Chinese University of Hong Kong, Shenzhen

Shunian Chen
The Chinese University of Hong Kong, Shenzhen
Large Language Models, Multimodal Large Language Models, Agent

Yifei Zhang
Nanjing University

Lihang Shen
Columbia University

Daniel Kim
Rensselaer Polytechnic Institute

Zhiwei Liu
The University of Manchester

Zheheng Luo
NaCTeM, University of Manchester
Natural Language Processing

Yangyang Yu
Stevens Institute of Technology
Cognitive Science, Language Agent Design, Bayesian Inference, Multi-modal Learning

Yupeng Cao
Stevens Institute of Technology
Natural Language Processing, MultiModal, Trustworthy AI

Zhiyang Deng
Stevens Institute of Technology

Zhiyuan Yao
Ph.D. in Financial Engineering, Stevens Institute of Technology
Reinforcement Learning, Machine Learning, ML/RL in Financial Trading

Haohang Li
Stevens Institute of Technology
Mechanistic Interpretability, Language Model, LLM Agent, FinTech

Duanyu Feng
Sichuan University
Machine Learning, Numerical Optimization, Natural Language Processing

Yongfu Dai
Sichuan University

VijayaSai Somasundaram
University of Florida

Peng Lu
University of Montreal

Yilun Zhao
Yale University

Yitao Long
New York University

Guojun Xiong
Harvard University, Department of Computer Science
Reinforcement Learning, Restless Bandits, Networking, Financial Agent

Kaleb Smith
NVIDIA

Honghai Yu
Nanjing University

Yanzhao Lai
Southwest Jiaotong University
Entrepreneurship, Fintech, HRM

Min Peng
Columbia University

Jianyun Nie
University of Montreal

Jordan W. Suchow
Stevens Institute of Technology

Xiao-Yang Liu
Columbia University
Tensor, Deep Learning, Reinforcement Learning, Big Data

Benyou Wang
Assistant Professor, The Chinese University of Hong Kong, Shenzhen
Large Language Models, Natural Language Processing, Information Retrieval, Applied Machine Learning

Alejandro Lopez-Lira
Assistant Professor of Finance, University of Florida
Fintech, Machine Learning, Asset Pricing, Macro Finance, Private Equity

Jimin Huang
The Fin AI
Computational Finance

Sophia Ananiadou
Professor, Computer Science, Manchester University, National Centre for Text Mining
Natural Language Processing, Text Mining, Computational Linguistics, Artificial Intelligence