Program of Thoughts for Financial Reasoning: Leveraging Dynamic In-Context Examples and Generative Retrieval

📅 2025-10-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the low accuracy of large language models (LLMs) on financial numerical reasoning benchmarks such as FinQA and ConvFinQA, this paper proposes FINDER, a two-step framework that combines generative retrieval with context-aware Program-of-Thought (PoT) prompting. FINDER first uses a generative retriever to extract the relevant facts from unstructured text and tabular data, then dynamically selects in-context few-shot examples for PoT prompting, which improves both the handling of mixed text-and-table inputs and the precision of numerical computation. Its core contributions are a generative-retrieval-driven fact extraction mechanism and a context-aware PoT chain construction strategy that explicitly grounds reasoning steps in the retrieved evidence. In the reported experiments, FINDER achieves state-of-the-art performance, improving execution accuracy by 5.98% on FinQA and 4.05% on ConvFinQA over prior methods.

📝 Abstract

Despite continuous advancements in the capabilities of large language models (LLMs), numerical reasoning remains a challenging area. Techniques like chain-of-thought prompting, tree-of-thought prompting, and program-of-thought prompting guide LLMs through intermediate reasoning steps. Although in-context learning with few-shot prompting has improved performance, LLMs still lag behind state-of-the-art models on financial numerical reasoning datasets such as FinQA and ConvFinQA. In this work, we introduce FINDER, a novel two-step framework, to enhance LLMs' capabilities in financial numerical reasoning. The first step uses a generative retriever to extract relevant facts from unstructured data, including both text and tables. The second step applies context-aware Program of Thought prompting with dynamic selection of in-context examples. FINDER achieves new state-of-the-art performance on both the FinQA and ConvFinQA datasets, surpassing previous benchmarks with execution accuracy improvements of 5.98% and 4.05%, respectively.
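To make the second step more concrete, the following is a minimal sketch of dynamic in-context example selection followed by Program-of-Thought prompt assembly. It is not the authors' implementation: the `Example` type, the token-overlap (Jaccard) similarity, the prompt layout, and all questions, facts, and figures are assumptions chosen for illustration, and the abstract does not say how FINDER scores candidate examples. The retrieved facts are taken as given here, standing in for the output of the generative retriever.

```python
import re
from dataclasses import dataclass


@dataclass
class Example:
    """A solved question used as a few-shot demonstration (hypothetical type)."""
    question: str
    facts: list[str]
    program: str


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; an assumed stand-in for the real selector."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def select_examples(question: str, pool: list[Example], k: int = 2) -> list[Example]:
    """Pick the k pool examples whose questions are most similar to the query."""
    ranked = sorted(pool, key=lambda ex: jaccard(question, ex.question), reverse=True)
    return ranked[:k]


def format_block(question: str, facts: list[str]) -> str:
    """Render the retrieved facts and the question in a fixed layout."""
    fact_lines = "\n".join(f"- {fact}" for fact in facts)
    return f"Facts:\n{fact_lines}\nQuestion: {question}"


def build_pot_prompt(question: str, facts: list[str], examples: list[Example]) -> str:
    """Few-shot PoT prompt: solved demonstrations first, then the open query."""
    parts = [format_block(ex.question, ex.facts) + "\nProgram:\n" + ex.program
             for ex in examples]
    parts.append(format_block(question, facts) + "\nProgram:\n")
    return "\n\n".join(parts)


# Hypothetical demonstration pool; the numbers are made up for illustration.
POOL = [
    Example(
        "What was the percentage change in net income from 2017 to 2018?",
        ["net income 2017: 80.0", "net income 2018: 100.0"],
        "ans = (100.0 - 80.0) / 80.0 * 100",
    ),
    Example(
        "What is the ratio of total debt to total equity in 2019?",
        ["total debt 2019: 50.0", "total equity 2019: 200.0"],
        "ans = 50.0 / 200.0",
    ),
    Example(
        "How many shares were repurchased during 2019?",
        ["shares repurchased 2019: 1200"],
        "ans = 1200",
    ),
]

QUERY = "What was the percentage change in revenue from 2018 to 2019?"
QUERY_FACTS = ["revenue 2018: 120.0", "revenue 2019: 150.0"]

chosen = select_examples(QUERY, POOL, k=2)
prompt = build_pot_prompt(QUERY, QUERY_FACTS, chosen)
```

The resulting `prompt` string would be sent to an LLM, which is expected to complete the final `Program:` section. Because the demonstrations are re-ranked for every query, a percentage-change question is paired with percentage-change demonstrations rather than a fixed few-shot set.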

Problem

Research questions and friction points this paper is trying to address.

Enhancing financial numerical reasoning of large language models

Extracting relevant facts from unstructured financial data

Improving accuracy on FinQA and ConvFinQA financial datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative retriever extracts facts from unstructured data

Dynamic selection of in-context examples for reasoning

Context-aware Program of Thought prompting enhances accuracy
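In Program-of-Thought prompting, the model emits a short program and the arithmetic is carried out by an interpreter rather than by the model itself; execution accuracy then measures how often the executed result matches the gold answer. The sketch below illustrates that idea under stated assumptions: the convention that a program stores its result in a variable named `ans`, the tolerance used for comparison, and the sample program and figures are all hypothetical and are not taken from the paper.

```python
import math


def run_program(program: str) -> float:
    """Execute a generated program and return the value it assigns to `ans`.

    Assumption: programs are plain arithmetic that ends by assigning `ans`.
    Emptying the builtins only limits what the program can call; it is not
    a security sandbox and untrusted code needs real isolation.
    """
    namespace = {"__builtins__": {}}
    exec(program, namespace)
    return float(namespace["ans"])


def execution_accuracy(programs: list[str], golds: list[float],
                       rel_tol: float = 1e-4) -> float:
    """Fraction of programs whose executed result matches the gold answer."""
    if not programs:
        return 0.0
    correct = 0
    for program, gold in zip(programs, golds):
        try:
            result = run_program(program)
        except Exception:
            continue  # a program that fails to run counts as incorrect
        if math.isclose(result, gold, rel_tol=rel_tol):
            correct += 1
    return correct / len(programs)


# Hypothetical model output for a percentage-change question.
DEMO_PROGRAM = """
revenue_2018 = 120.0
revenue_2019 = 150.0
ans = (revenue_2019 - revenue_2018) / revenue_2018 * 100
"""

demo_result = run_program(DEMO_PROGRAM)
demo_accuracy = execution_accuracy([DEMO_PROGRAM, "ans = 1 / 0"], [25.0, 3.0])
```

Here the first program evaluates to 25.0 and matches its gold answer, while the second raises an error and is scored as wrong, giving an execution accuracy of 0.5 on this two-item toy set.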
