🤖 AI Summary
Financial regulatory compliance involves heterogeneous multimodal documents (text, tables, charts), and continually evolving regulations make accurate extraction of critical information difficult. Method: This paper proposes FinSage, a multi-aspect RAG framework for regulatory compliance analysis over multimodal financial documents. It introduces a unified multimodal preprocessing pipeline that generates chunk-level metadata summaries, a multi-path sparse-dense retrieval system augmented with HyDE-based query expansion and metadata-aware semantic search, and a domain-specific re-ranker fine-tuned via Direct Preference Optimization (DPO). Contribution/Results: FinSage achieves 92.51% recall on 75 expert-curated questions and surpasses the best baseline method on the FinanceBench benchmark by 24.06% in accuracy. Deployed as an online financial question-answering agent, it has served more than 1,200 users.
📝 Abstract
Leveraging large language models in real-world settings often requires domain-specific data and tools in order to comply with the complex regulations that govern acceptable use. Within the financial sector, modern enterprises increasingly rely on Retrieval-Augmented Generation (RAG) systems to address complex compliance requirements in financial document workflows. However, existing solutions struggle to account for the inherent heterogeneity of the data (e.g., text, tables, diagrams) and the evolving nature of the regulatory standards used in financial filings, which compromises accuracy in critical information extraction. We propose FinSage, a multi-aspect RAG framework tailored for regulatory compliance analysis in multi-modal financial documents. FinSage introduces three components: (1) a multi-modal pre-processing pipeline that unifies diverse data formats and generates chunk-level metadata summaries, (2) a multi-path sparse-dense retrieval system augmented with query expansion (HyDE) and metadata-aware semantic search, and (3) a domain-specialized re-ranking module fine-tuned via Direct Preference Optimization (DPO) to prioritize compliance-critical content. Extensive experiments demonstrate that FinSage achieves a recall of 92.51% on 75 expert-curated questions and surpasses the best baseline method on the FinanceBench question-answering dataset by 24.06% in accuracy. Moreover, FinSage has been successfully deployed as a financial question-answering agent in online meetings, where it has already served more than 1,200 people.
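To make the idea of multi-path retrieval more concrete, here is a minimal sketch of how ranked results from several retrieval paths could be merged into one candidate list before re-ranking. It uses reciprocal rank fusion (RRF), which is an assumption for illustration only: the abstract does not say how FinSage combines its paths. The function name, the chunk IDs, and the constant `k=60` are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into a single ranked list.

    Each chunk scores sum(1 / (k + rank)) over the lists that contain it,
    with rank starting at 1. k=60 is a commonly used default, not a value
    taken from the paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by chunk ID for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical outputs of three retrieval paths for the same query.
sparse_hits = ["c3", "c1", "c7"]   # e.g. keyword-based (sparse) search
dense_hits = ["c1", "c4", "c3"]    # e.g. embedding-based (dense) search
metadata_hits = ["c1", "c7"]       # e.g. search over chunk-level metadata summaries

fused = reciprocal_rank_fusion([sparse_hits, dense_hits, metadata_hits])
print(fused)
```

In a pipeline like the one described, a fused list of this kind would be the input to the re-ranking module. Rank-based fusion is one simple option because it does not require the scores of different retrievers to be on the same scale.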