🤖 AI Summary
Financial Text-to-SQL faces steep challenges, including highly complex schemas, domain-specific terminology, and high error costs, yet it lacks large-scale domain-specific benchmarks and tailored evaluation metrics. To close this gap, we introduce FINCH, the first comprehensive financial-domain benchmark, comprising 292 tables and 75,725 high-quality question-SQL pairs. We also propose the FINCH Score, a semantics-aware evaluation metric that precisely quantifies correctness along the SQL dimensions most critical in finance: numerical expressions, temporal logic, and conditional reasoning. Benchmarking reasoning models and language models of varying scales with in-context learning and systematic evaluation, we identify persistent performance bottlenecks across diverse model families in financial settings, establishing a reproducible, comparable foundation for future domain-specific Text-to-SQL research.
📝 Abstract
Text-to-SQL, the task of translating natural language questions into SQL queries, has long been a central challenge in NLP. While progress has been significant, applying it to the financial domain remains especially difficult due to complex schemas, domain-specific terminology, and the high stakes of error. Despite this, no dedicated large-scale financial dataset exists to advance research, leaving a critical gap. To address it, we introduce a curated financial dataset (FINCH) comprising 292 tables and 75,725 natural language-SQL pairs, enabling both fine-tuning and rigorous evaluation. Building on this resource, we benchmark reasoning models and language models of varying scales, providing a systematic analysis of their strengths and limitations on financial Text-to-SQL tasks. Finally, we propose a finance-oriented evaluation metric (FINCH Score) that captures nuances overlooked by existing measures, offering a more faithful assessment of model performance.