FINCH: Financial Intelligence using Natural language for Contextualized SQL Handling

📅 2025-10-02
🤖 AI Summary
Financial Text-to-SQL faces significant challenges, including highly complex schemas, domain-specific terminology, and high error costs, yet it lacks large-scale, domain-specific benchmarks and tailored evaluation metrics. To address this gap, we introduce FINCH, a comprehensive financial-domain benchmark comprising 292 tables and 75,725 high-quality question-SQL pairs. We propose FINCH Score, a semantics-aware evaluation metric that quantifies correctness across critical financial SQL dimensions: numerical expressions, temporal logic, and conditional reasoning. Leveraging large language models and reasoning-augmented architectures, we integrate in-context learning with systematic evaluation, achieving substantial improvements in SQL generation accuracy over complex financial schemas. Extensive experiments identify persistent performance bottlenecks across diverse model families in financial settings, establishing a reproducible, comparable foundation for future domain-specific Text-to-SQL research.

📝 Abstract
Text-to-SQL, the task of translating natural language questions into SQL queries, has long been a central challenge in NLP. While progress has been significant, applying it to the financial domain remains especially difficult due to complex schemas, domain-specific terminology, and the high stakes of errors. Despite this, there is no dedicated large-scale financial dataset to advance research, creating a critical gap. To address this, we introduce a curated financial dataset (FINCH) comprising 292 tables and 75,725 natural language-SQL pairs, enabling both fine-tuning and rigorous evaluation. Building on this resource, we benchmark reasoning models and language models of varying scales, providing a systematic analysis of their strengths and limitations in financial Text-to-SQL tasks. Finally, we propose a finance-oriented evaluation metric (FINCH Score) that captures nuances overlooked by existing measures, offering a more faithful assessment of model performance.
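To make the task concrete, here is a minimal sketch of a Text-to-SQL pair and a plain execution-match check, the kind of generic existing measure the abstract contrasts with. It is not the FINCH Score, whose definition is not given here. The `transactions` table, its rows, the question, and both queries are hypothetical and are not taken from the FINCH dataset.

```python
import sqlite3

# Hypothetical toy schema and data, for illustration only (not from FINCH).
SCHEMA = """
CREATE TABLE transactions (
    id INTEGER PRIMARY KEY,
    account_id INTEGER,
    amount REAL,
    booked_on TEXT
);
INSERT INTO transactions VALUES
    (1, 10, 250.0, '2024-01-15'),
    (2, 10, -40.0, '2024-02-03'),
    (3, 11, 900.0, '2024-02-20');
"""


def run(conn, sql):
    """Execute a query and return its rows sorted, so row order is ignored."""
    return sorted(conn.execute(sql).fetchall())


def execution_match(conn, predicted_sql, gold_sql):
    """Return True when both queries produce the same multiset of rows."""
    try:
        return run(conn, predicted_sql) == run(conn, gold_sql)
    except sqlite3.Error:
        # A query that fails to execute counts as a miss.
        return False


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

question = "What is the total amount booked per account in February 2024?"
gold = """
SELECT account_id, SUM(amount) FROM transactions
WHERE booked_on >= '2024-02-01' AND booked_on < '2024-03-01'
GROUP BY account_id
"""
# A plausible wrong prediction: the date filter is missing.
predicted = "SELECT account_id, SUM(amount) FROM transactions GROUP BY account_id"

match = execution_match(conn, predicted, gold)
print(match)
```

A binary check like this reports only that the prediction is wrong. It does not indicate that the temporal condition was the part that failed, which is the sort of nuance the abstract says existing measures overlook.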
Problem

Research questions and friction points this paper is trying to address.

Addressing the lack of financial Text-to-SQL datasets
Benchmarking model performance on complex financial schemas
Proposing a specialized evaluation metric for finance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Curated financial dataset for Text-to-SQL fine-tuning
Benchmarked reasoning and language models systematically
Proposed finance-oriented evaluation metric FINCH Score
Avinash Kumar Singh
Domyn, Hyderabad, India
Bhaskarjit Sarmah
Domyn
Machine Learning, Generative AI, Agentic AI, Responsible AI
Stefano Pasquali
Domyn, New York, USA