🤖 AI Summary
A lack of high-quality benchmark datasets covering multiple reasoning types hinders evaluation of question answering (QA) over enterprise climate disclosures. Method: We introduce Climate Finance Bench, an open benchmark for climate disclosure QA comprising 33 sustainability reports spanning all 11 GICS sectors and 330 expert-validated QA pairs covering extraction, numerical reasoning, and logical reasoning. Building on this dataset, we systematically compare retrieval-augmented generation (RAG) approaches for climate QA. Contribution/Results: Our evaluation shows that the retriever's ability to locate answer-bearing passages is the chief bottleneck for end-to-end QA performance. We further advocate low-carbon AI practices, including weight quantization, to support transparent carbon reporting in AI-for-climate applications. Climate Finance Bench provides a reproducible, domain-specific evaluation framework, addressing a gap in climate-related QA benchmarking.
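To illustrate the retrieval bottleneck, the sketch below scores toy report chunks against a question with an off-the-shelf dense retriever and checks whether the annotated answer-bearing chunk appears in the top-k results. The model name, passages, and hit check are illustrative assumptions, not the paper's exact pipeline.

```python
# Minimal sketch of measuring retrieval hit rate for a climate-report QA pair.
# Assumes a plain cosine-similarity dense retriever (sentence-transformers).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Toy passages standing in for chunks of a sustainability report.
passages = [
    "Scope 1 emissions in 2023 totalled 1.2 MtCO2e, down 8% year on year.",
    "The board approved a new biodiversity policy in March 2023.",
    "Scope 2 market-based emissions were 0.4 MtCO2e in 2023.",
]
question = "What were the company's Scope 1 emissions in 2023?"
gold_passage_idx = 0  # annotated answer-bearing chunk

# Embed passages and the question, then rank passages by cosine similarity.
passage_emb = model.encode(passages, convert_to_tensor=True)
query_emb = model.encode(question, convert_to_tensor=True)
hits = util.semantic_search(query_emb, passage_emb, top_k=2)[0]

# A "hit" means the gold chunk was retrieved; averaging this over all QA pairs
# gives the retrieval hit rate that bounds downstream answer accuracy.
retrieved_ids = [h["corpus_id"] for h in hits]
print("retrieved:", retrieved_ids, "hit:", gold_passage_idx in retrieved_ids)
```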
📝 Abstract
Climate Finance Bench introduces an open benchmark that targets question answering over corporate climate disclosures using large language models. We curate 33 recent sustainability reports in English, drawn from companies across all 11 GICS sectors, and annotate 330 expert-validated question-answer pairs that span pure extraction, numerical reasoning, and logical reasoning. Building on this dataset, we present a comparison of retrieval-augmented generation (RAG) approaches. We show that the retriever's ability to locate passages that actually contain the answer is the chief performance bottleneck. We further argue for transparent carbon reporting in AI-for-climate applications, highlighting the advantages of techniques such as weight quantization.
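To make the low-carbon angle concrete, the following is a minimal sketch of loading a generator with 4-bit weight quantization via Hugging Face transformers and bitsandbytes; the checkpoint and configuration values are assumptions for illustration, not the models evaluated in the paper.

```python
# Sketch: 4-bit weight quantization to reduce memory and inference energy.
# The checkpoint below is an assumed example, not the paper's choice of model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                        # store weights in 4-bit NF4 format
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,    # compute in bf16 for stability
)

model_id = "mistralai/Mistral-7B-Instruct-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)

# Answer a question given a retrieved report chunk as context.
prompt = (
    "Context: Scope 1 emissions in 2023 totalled 1.2 MtCO2e.\n"
    "Question: What were the company's Scope 1 emissions in 2023?\nAnswer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```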