Climate Finance Bench

📅 2025-05-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: The lack of high-quality benchmark datasets covering multiple reasoning types hinders evaluation of question answering (QA) over corporate climate disclosures. Method: We introduce Climate Finance Bench, an open benchmark comprising 33 sustainability reports drawn from all 11 GICS sectors and 330 expert-validated QA pairs that span extraction, numerical reasoning, and logical reasoning. We compare retrieval-augmented generation (RAG) approaches on this dataset and find that the retriever's ability to locate answer-bearing passages is the primary performance bottleneck. Contribution/Results: Our evaluation indicates that end-to-end QA performance depends chiefly on retrieval quality. We also argue for transparent carbon reporting in AI-for-climate applications and highlight the advantages of techniques such as weight quantization. Climate Finance Bench offers a domain-specific evaluation resource for climate-related QA.

📝 Abstract
Climate Finance Bench introduces an open benchmark that targets question-answering over corporate climate disclosures using Large Language Models. We curate 33 recent sustainability reports in English drawn from companies across all 11 GICS sectors and annotate 330 expert-validated question-answer pairs that span pure extraction, numerical reasoning, and logical reasoning. Building on this dataset, we compare retrieval-augmented generation (RAG) approaches. We show that the retriever's ability to locate passages that actually contain the answer is the chief performance bottleneck. We further argue for transparent carbon reporting in AI-for-climate applications, highlighting the advantages of techniques such as weight quantization.
Problem

Research questions and friction points this paper is trying to address.

Develops benchmark for LLM-based climate disclosure QA
Evaluates RAG approaches for sustainability report analysis
Addresses retriever accuracy as key performance bottleneck
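
The retriever bottleneck named above can be made concrete with a hit-rate measurement: for each question, check whether any of the top-k retrieved passages contains the annotated answer. The following is a minimal sketch under stated assumptions, not the paper's pipeline: the token-overlap retriever, the sample passages and QA pairs, and all function names are hypothetical and invented for illustration.

```python
import re
from typing import List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, passages: List[str], k: int) -> List[str]:
    """Toy retriever (assumption): rank passages by token overlap with the question."""
    q_tokens = tokenize(question)
    ranked = sorted(passages, key=lambda p: len(q_tokens & tokenize(p)), reverse=True)
    return ranked[:k]


def hit_rate_at_k(examples: List[Tuple[str, str]], passages: List[str], k: int) -> float:
    """Fraction of questions whose answer string appears in at least one top-k passage."""
    hits = 0
    for question, answer in examples:
        top_passages = retrieve(question, passages, k)
        if any(answer.lower() in p.lower() for p in top_passages):
            hits += 1
    return hits / len(examples)


# Hypothetical passages and QA pairs, not taken from the benchmark.
PASSAGES = [
    "Scope 1 emissions in 2023 were 120 ktCO2e across all sites.",
    "The board reviews climate risk twice a year.",
    "Scope 2 emissions in 2023 were 80 ktCO2e under the market-based method.",
]
EXAMPLES = [
    ("What were Scope 1 emissions in 2023?", "120 ktCO2e"),
    ("How often does the board review climate risk?", "twice a year"),
]

if __name__ == "__main__":
    print(hit_rate_at_k(EXAMPLES, PASSAGES, k=1))
```

A low hit rate at small k means the generator never sees the answer-bearing passage, so no amount of downstream prompting can recover the answer. That is why measuring the retriever separately from end-to-end QA accuracy is useful.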
Innovation

Methods, ideas, or system contributions that make the work stand out.

Open benchmark for climate disclosure QA
RAG approach comparison on annotated dataset
Weight Quantization for carbon efficiency
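
To illustrate why weight quantization can reduce compute and memory cost, here is a generic sketch of symmetric 8-bit quantization. The summary does not state which quantization scheme the paper uses, so the per-tensor symmetric scheme, the function names, and the sample weights below are assumptions for illustration only.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integers and the shared scale."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]  # hypothetical weights
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    worst_error = max(abs(w - r) for w, r in zip(weights, restored))
    print(q, scale, worst_error)
```

Each quantized weight fits in one byte instead of the four bytes of a 32-bit float, at the cost of a rounding error of at most half the scale per weight.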
Rafik Mankour
Institut Louis Bachelier
Yassine Chafai
Institut Louis Bachelier
Hamada Saleh
Institut Louis Bachelier
Ghassen Ben Hassine
Institut Louis Bachelier
Thibaud Barreau
Institut Louis Bachelier
Peter Tankov
ENSAE ParisTech
Mathematical finance, Lévy processes