🤖 AI Summary
This study systematically investigates the practical capabilities and limitations of large language models (LLMs) in finance. Addressing four core tasks—financial report generation, market trend forecasting, investor sentiment analysis, and personalized investment advisory—we construct the first cross-task unified instruction benchmark grounded in real-world financial contexts. We conduct multi-dimensional evaluations of Transformer-based models, including GPT-4, under domain-specific constraints. Results reveal that while GPT-4 demonstrates strong instruction-following proficiency, it exhibits critical bottlenecks in domain alignment, interpretability, and regulatory compliance. To our knowledge, this is the first work to standardize multi-task LLM evaluation in finance, bridging the gap between theoretical research and industrial deployment. Our findings provide empirical grounding for LLM adoption in financial applications and identify three pivotal research directions: enhancing model interpretability, optimizing domain-specific adaptation, and integrating regulatory compliance by design. (149 words)
📝 Abstract
In recent years, Large Language Models (LLMs) like ChatGPT have seen considerable advancements and have been applied in diverse fields. Built on the Transformer architecture, these models are trained on extensive datasets, enabling them to understand and generate human language effectively. In the financial domain, the deployment of LLMs is gaining momentum. These models are being utilized for automating financial report generation, forecasting market trends, analyzing investor sentiment, and offering personalized financial advice. Leveraging their natural language processing capabilities, LLMs can distill key insights from vast financial data, aiding institutions in making informed investment choices and enhancing both operational efficiency and customer satisfaction. In this study, we provide a comprehensive overview of the emerging integration of LLMs into various financial tasks. Additionally, we conducted holistic tests on multiple financial tasks through the combination of natural language instructions. Our findings show that GPT-4 effectively follow prompt instructions across various financial tasks. This survey and evaluation of LLMs in the financial domain aim to deepen the understanding of LLMs' current role in finance for both financial practitioners and LLM researchers, identify new research and application prospects, and highlight how these technologies can be leveraged to solve practical challenges in the finance industry.