🤖 AI Summary
This work addresses a limitation of existing reranking methods for large language models in financial long-document question answering: they rely solely on semantic relevance and thus fail to enforce strict constraints on entities, financial metrics, fiscal years, and numerical values, leading to unstable and uninterpretable rankings. To overcome this, the authors propose FinCARDS, a structured reranking framework that formulates evidence selection as a constraint satisfaction problem guided by finance-aware patterns. By leveraging field-aligned card representations, deterministic field matching, multi-stage tournament reranking, and stability-aware aggregation, FinCARDS achieves auditable and highly stable evidence ranking without requiring model fine-tuning or additional inference overhead. Experiments demonstrate that the method significantly outperforms lexical and LLM-based reranking baselines on two financial QA benchmarks, substantially improving early retrieval performance and effectively reducing ranking variance.
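The "field-aligned card" and "deterministic field matching" ideas above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation: the `Card` schema, the field names, and `field_match_score` are all assumptions chosen to mirror the four constraint types the summary lists (entities, metrics, periods, numeric values).

```python
# Hypothetical sketch of field-aligned cards and deterministic field matching.
# Card/field names are illustrative assumptions, not the paper's actual schema.
from dataclasses import dataclass, field

@dataclass
class Card:
    """Schema-aligned representation of a question or a filing chunk."""
    entities: set = field(default_factory=set)   # e.g. {"Apple Inc."}
    metrics: set = field(default_factory=set)    # e.g. {"net revenue"}
    periods: set = field(default_factory=set)    # e.g. {"FY2023"}
    numerics: set = field(default_factory=set)   # e.g. {"383.3B"}

def field_match_score(question: Card, chunk: Card) -> int:
    """Count how many question fields the chunk fully satisfies.

    A field counts as satisfied only when the chunk covers every value
    the question requires in that field, so the check is deterministic
    and auditable (each satisfied field can be reported as a reason).
    """
    score = 0
    for name in ("entities", "metrics", "periods", "numerics"):
        required = getattr(question, name)
        available = getattr(chunk, name)
        if required and required <= available:  # all required values present
            score += 1
    return score

# Toy usage: a chunk with the right entity, metric, and fiscal year
# outranks one with a mismatched fiscal year.
q = Card(entities={"Apple Inc."}, metrics={"net revenue"}, periods={"FY2023"})
good = Card(entities={"Apple Inc."}, metrics={"net revenue"},
            periods={"FY2023"}, numerics={"383.3B"})
bad = Card(entities={"Apple Inc."}, periods={"FY2022"})
assert field_match_score(q, good) > field_match_score(q, bad)
```

Because matching is set containment rather than similarity, a chunk that mentions the right company but the wrong fiscal year is penalized even if it is semantically close to the question.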
📄 Abstract
Financial question answering (QA) over long corporate filings requires evidence to satisfy strict constraints on entities, financial metrics, fiscal periods, and numeric values. However, existing LLM-based rerankers primarily optimize semantic relevance, leading to unstable rankings and opaque decisions on long documents. We propose FinCards, a structured reranking framework that reframes financial evidence selection as constraint satisfaction under a finance-aware schema. FinCards represents filing chunks and questions using aligned schema fields (entities, metrics, periods, and numeric spans), enabling deterministic field-level matching. Evidence is selected via a multi-stage tournament reranking with stability-aware aggregation, producing auditable decision traces. Across two corporate filing QA benchmarks, FinCards substantially improves early-rank retrieval over both lexical and LLM-based reranking baselines, while reducing ranking variance, without requiring model fine-tuning or unpredictable inference budgets. Our code is available at https://github.com/XanderZhou2022/FINCARDS.
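The abstract's "multi-stage tournament reranking with stability-aware aggregation" can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's method: the group size, number of rounds, and mean-rank aggregation are illustrative choices, and `score_fn` stands in for whatever (possibly noisy) comparison the reranker uses.

```python
# Illustrative sketch (not the authors' code) of tournament reranking:
# candidates compete in small shuffled groups over several rounds, and
# per-round ranks are aggregated by mean rank so that no single noisy
# comparison determines the final ordering.
import random
from statistics import mean

def tournament_rerank(candidates, score_fn, group_size=4, rounds=3, seed=0):
    rng = random.Random(seed)
    per_round_ranks = {c: [] for c in candidates}
    for _ in range(rounds):
        pool = list(candidates)
        rng.shuffle(pool)
        staged = []
        # Stage 1: rank each small group locally.
        for i in range(0, len(pool), group_size):
            group = pool[i:i + group_size]
            staged.extend(sorted(group, key=score_fn, reverse=True))
        # Stage 2: merge the group orderings into one round-level ranking.
        round_ranking = sorted(staged, key=score_fn, reverse=True)
        for rank, c in enumerate(round_ranking):
            per_round_ranks[c].append(rank)
    # Stability-aware aggregation: order by mean rank across rounds.
    return sorted(candidates, key=lambda c: mean(per_round_ranks[c]))
```

In practice `score_fn` would be a noisy LLM judgment; averaging ranks over shuffled rounds is what reduces the ranking variance the abstract reports.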