FinGEAR: Financial Mapping-Guided Enhanced Answer Retrieval

📅 2025-09-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Standard RAG retrieves poorly from lengthy, hierarchically structured, terminology-dense financial disclosures such as 10-K filings. FinGEAR is a fine-grained, query-aware retrieval framework tailored to these documents. It combines a finance lexicon for Item-level guidance (FLAM) with two hierarchical indices for within-Item search (a Summary Tree and a Question Tree), which aligns retrieval with the regulatory section hierarchy and domain terminology, and then applies a two-stage cross-encoder reranker to strengthen query-passage alignment. On full 10-K documents with queries aligned to the FinQA dataset, FinGEAR improves F1 by up to 56.7% over flat RAG baselines and increases downstream question-answering accuracy with a fixed reader. The framework offers a practical foundation for high-stakes financial analysis.

📝 Abstract
Financial disclosures such as 10-K filings present challenging retrieval problems due to their length, regulatory section hierarchy, and domain-specific language, which standard retrieval-augmented generation (RAG) models underuse. We introduce FinGEAR (Financial Mapping-Guided Enhanced Answer Retrieval), a retrieval framework tailored to financial documents. FinGEAR combines a finance lexicon for Item-level guidance (FLAM), dual hierarchical indices for within-Item search (Summary Tree and Question Tree), and a two-stage cross-encoder reranker. This design aligns retrieval with disclosure structure and terminology, enabling fine-grained, query-aware context selection. Evaluated on full 10-Ks with queries aligned to the FinQA dataset, FinGEAR delivers consistent gains in precision, recall, F1, and relevancy, improving F1 by up to 56.7% over flat RAG, 12.5% over graph-based RAGs, and 217.6% over prior tree-based systems, while also increasing downstream answer accuracy with a fixed reader. By jointly modeling section hierarchy and domain lexicon signals, FinGEAR improves retrieval fidelity and provides a practical foundation for high-stakes financial analysis.
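The retrieval metrics named in the abstract can be illustrated with a minimal sketch. It assumes set-based precision, recall, and F1 over retrieved passage identifiers, and it assumes the reported percentages are relative gains over a baseline; the abstract does not spell out either definition, and the function names here are hypothetical.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based retrieval metrics over passage identifiers (assumed definition)."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def relative_gain_pct(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent (assumed reading)."""
    return 100.0 * (new - baseline) / baseline


# Illustrative identifiers only; these are not results from the paper.
p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], ["a", "b", "e"])
```

Under this reading, a gain of "up to 56.7%" would mean the best-case F1 is about 1.567 times the baseline F1, not 56.7 percentage points higher.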
Problem

Research questions and friction points this paper is trying to address.

Retrieving precise answers from lengthy financial disclosures
Overcoming domain-specific language challenges in financial documents
Enhancing retrieval accuracy using hierarchical structure and finance lexicon
Innovation

Methods, ideas, or system contributions that make the work stand out.

Finance lexicon guidance for retrieval
Dual hierarchical indices for search
Two-stage cross-encoder reranker system
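The three contributions listed above can be pictured as a retrieval flow: route the query to likely 10-K Items with a lexicon, gather candidates within those Items, then rerank in two stages. The sketch below is a toy illustration, not the paper's implementation. The lexicon entries, weights, corpus, and function names are all hypothetical; within-Item search is flattened to a plain passage list (the Summary Tree and Question Tree are not modeled); and simple token-overlap scorers stand in for the cross-encoders.

```python
from collections import Counter

# Hypothetical lexicon: finance terms mapped to weights over 10-K Items.
LEXICON = {
    "revenue": {"Item 7": 0.7, "Item 8": 0.3},
    "risk": {"Item 1A": 1.0},
    "debt": {"Item 7": 0.4, "Item 8": 0.6},
}

# Hypothetical corpus: passages grouped by Item.
CORPUS = {
    "Item 7": [
        "Revenue growth was driven by cloud subscriptions.",
        "Operating expenses rose on higher headcount.",
    ],
    "Item 8": ["Total revenue is reported in the consolidated statements."],
    "Item 1A": ["Competition is a material risk to revenue growth."],
}


def tokens(text):
    """Lowercase whitespace tokens with trailing punctuation stripped."""
    return [t.strip(".,?!").lower() for t in text.split()]


def route_items(query, lexicon, top_items=2):
    """Lexicon-guided routing: sum term weights per Item, keep the best Items."""
    scores = Counter()
    for tok in tokens(query):
        for item, weight in lexicon.get(tok, {}).items():
            scores[item] += weight
    return [item for item, _ in scores.most_common(top_items)]


def overlap_score(query, passage):
    """Stage-1 stand-in scorer: fraction of query tokens found in the passage."""
    q, p = set(tokens(query)), set(tokens(passage))
    return len(q & p) / len(q) if q else 0.0


def jaccard_score(query, passage):
    """Stage-2 stand-in scorer: Jaccard similarity of the token sets."""
    q, p = set(tokens(query)), set(tokens(passage))
    return len(q & p) / len(q | p) if q | p else 0.0


def retrieve(query, corpus, lexicon, shortlist=3, final_k=1):
    """Route to Items, shortlist candidates, then rerank the shortlist."""
    items = route_items(query, lexicon)
    candidates = [(item, p) for item in items for p in corpus.get(item, [])]
    stage1 = sorted(candidates, key=lambda c: overlap_score(query, c[1]),
                    reverse=True)[:shortlist]
    stage2 = sorted(stage1, key=lambda c: jaccard_score(query, c[1]),
                    reverse=True)
    return stage2[:final_k]


results = retrieve("What drove revenue growth?", CORPUS, LEXICON)
```

In this toy run the query contains only the lexicon term "revenue", so passages under Item 1A are never scored even though one mentions revenue growth. That is the intended effect of Item-level guidance in this sketch: the candidate pool is narrowed before any expensive reranking takes place.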
Ying Li
The University of Edinburgh, United Kingdom
Mengyu Wang
The University of Edinburgh, United Kingdom
Miguel de Carvalho
School of Mathematics, University of Edinburgh
Statistics of Extremes, Heavy Tails, Financial Statistics
Sotirios Sabanis
Professor, University of Edinburgh & National Technical University of Athens
Stochastic Analysis, Numerics, Mathematical Finance, Computational Statistics, Data Science
Tiejun Ma
The University of Edinburgh, United Kingdom; The Artificial Intelligence Applications Institute, The University of Edinburgh, United Kingdom