Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

📅 2024-08-20
🏛️ arXiv.org
📈 Citations: 8
Influential: 0
📄 PDF
🤖 AI Summary
Existing financial large language models (LLMs) suffer from data scarcity, weak multimodal capabilities, and narrow evaluation protocols. To address these limitations, we introduce Open-FinLLMs, the first open-source multimodal LLM family for finance, comprising FinLLaMA, FinLLaMA-Instruct, and FinLLaVA. The models jointly handle text, tabular data, time-series signals, and charts, and support zero-shot, few-shot, and fine-tuning paradigms. Built upon the LLaMA architecture, FinLLaMA is pretrained on 52B tokens of financial text; FinLLaMA-Instruct is tuned on 573K financial instructions; and FinLLaVA is aligned with 1.43M multimodal samples. Evaluated across 14 financial tasks, 30 datasets, and 4 multimodal tasks, including two newly introduced multimodal evaluation datasets, the models outperform GPT-4 and other strong baselines on financial NLP, decision-making, and cross-modal analysis. All code and models are released under OSI-approved licenses to foster academic and industrial advancement.

📝 Abstract
Financial LLMs hold promise for advancing financial tasks and domain-specific applications. However, they are limited by scarce corpora, weak multimodal capabilities, and narrow evaluations, making them less suited for real-world application. To address this, we introduce Open-FinLLMs, the first open-source multimodal financial LLMs designed to handle diverse tasks across text, tabular, time-series, and chart data, excelling in zero-shot, few-shot, and fine-tuning settings. The suite includes FinLLaMA, pre-trained on a comprehensive 52-billion-token corpus; FinLLaMA-Instruct, fine-tuned with 573K financial instructions; and FinLLaVA, enhanced with 1.43M multimodal tuning pairs for strong cross-modal reasoning. We comprehensively evaluate Open-FinLLMs across 14 financial tasks, 30 datasets, and 4 multimodal tasks in zero-shot, few-shot, and supervised fine-tuning settings, introducing two new multimodal evaluation datasets. Our results show that Open-FinLLMs outperform advanced financial and general LLMs such as GPT-4 across financial NLP, decision-making, and multimodal tasks, highlighting their potential to tackle real-world challenges. To foster innovation and collaboration across academia and industry, we release all code (https://anonymous.4open.science/r/PIXIU2-0D70/B1D7/LICENSE) and models under OSI-approved licenses.
Problem

Research questions and friction points this paper is trying to address.

Addressing limited financial corpora and weak multimodal capabilities in Financial LLMs
Introducing open-source multimodal Financial LLMs for diverse financial tasks
Enhancing performance in financial NLP, decision-making, and multimodal tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Open-source multimodal financial LLMs
Pre-trained on 52B-token corpus
Enhanced with 1.43M multimodal pairs
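The zero- and few-shot evaluation settings highlighted above reduce to prompt construction: a handful of labeled financial examples are prepended to the query before it is sent to the model. A minimal sketch of that idea follows; the helper name, sample headlines, and labels are illustrative assumptions, not taken from the paper's released code.

```python
def build_few_shot_prompt(task_instruction, examples, query):
    """Assemble a k-shot prompt: instruction, k (input, label) demos, then the query."""
    parts = [task_instruction.strip()]
    for text, label in examples:
        parts.append(f"Input: {text}\nAnswer: {label}")
    # The query is left unanswered; the model completes the final "Answer:".
    parts.append(f"Input: {query}\nAnswer:")
    return "\n\n".join(parts)

# Example: 2-shot financial sentiment classification (hypothetical data).
demos = [
    ("Shares surged after the earnings beat.", "positive"),
    ("The firm warned of weaker guidance.", "negative"),
]
prompt = build_few_shot_prompt(
    "Classify the sentiment of each financial headline as positive, negative, or neutral.",
    demos,
    "Revenue was flat year over year.",
)
print(prompt)
```

Setting `examples` to an empty list yields the zero-shot variant of the same prompt.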
👥 Authors
Qianqian Xie
Wuhan University
NLP, LLM

Dong Li
The Fin AI

Mengxi Xiao
Wuhan University
Psychology, Large Language Model

Zihao Jiang
Wuhan University
Knowledge Graph, Natural Language Processing, Large Language Model

Ruoyu Xiang
The Fin AI

Xiao Zhang
The Fin AI

Zhengyu Chen
The Fin AI

Yueru He
Columbia University
Finance, Large Language Models

Weiguang Han
Columbia University

Yuzhe Yang
The Chinese University of Hong Kong, Shenzhen

Shunian Chen
The Chinese University of Hong Kong, Shenzhen
Large Language Models, Multimodal Large Language Models, Agent

Yifei Zhang
Nanjing University

Lihang Shen
Columbia University

Daniel Kim
Rensselaer Polytechnic Institute

Zhiwei Liu
The University of Manchester

Zheheng Luo
NaCTeM, University of Manchester
Natural Language Processing

Yangyang Yu
Stevens Institute of Technology
Cognitive Science, Language Agent Design, Bayesian Inference, Multi-modal Learning

Yupeng Cao
Stevens Institute of Technology
Natural Language Processing, MultiModal, Trustworthy AI

Zhiyang Deng
Stevens Institute of Technology

Zhiyuan Yao
Ph.D. in Financial Engineering, Stevens Institute of Technology
Reinforcement Learning, Machine Learning, ML/RL in Financial Trading

Haohang Li
Stevens Institute of Technology
Mechanistic Interpretability, Language Model, LLM Agent, FinTech

Duanyu Feng
Sichuan University
Machine Learning, Numerical Optimization, Natural Language Processing

Yongfu Dai
Sichuan University

VijayaSai Somasundaram
University of Florida

Peng Lu
University of Montreal

Yilun Zhao
Yale University

Yitao Long
New York University

Guojun Xiong
Harvard University, Department of Computer Science
Reinforcement Learning, Restless Bandits, Networking, Financial Agent

Kaleb Smith
NVIDIA

Honghai Yu
Nanjing University

Yanzhao Lai
Southwest Jiaotong University
Entrepreneurship, Fintech, HRM

Min Peng
Columbia University

Jianyun Nie
University of Montreal

Jordan W. Suchow
Stevens Institute of Technology

Xiao-Yang Liu
Columbia University
Tensor, Deep Learning, Reinforcement Learning, Big Data

Benyou Wang
Assistant Professor, The Chinese University of Hong Kong, Shenzhen
Large Language Models, Natural Language Processing, Information Retrieval, Applied Machine Learning

Alejandro Lopez-Lira
Assistant Professor of Finance, University of Florida
Fintech, Machine Learning, Asset Pricing, Macro Finance, Private Equity

Jimin Huang
The Fin AI
Computational Finance

Sophia Ananiadou
Professor, Computer Science, Manchester University, National Centre for Text Mining
Natural Language Processing, Text Mining, Computational Linguistics, Artificial Intelligence