🤖 AI Summary
ESG reports are predominantly published as unstructured, format-heterogeneous PDFs, which impedes standardized information extraction and cross-report alignment. To address this, we propose the first method that explicitly incorporates disclosure frameworks, such as those of the Sustainability Accounting Standards Board (SASB), into large language model (LLM)-based analysis via a dual-channel retrieval-augmented system: one channel performs semantic retrieval using a vector database, while the other uses natural language inference to align report passages with specific standard clauses. The end-to-end system combines information extraction, interactive visualization dashboards, and conversational querying, and supports automated metric population, cross-firm benchmarking, and interactive exploration. Evaluated on ESG reports from four global corporations across 12 SASB sub-industries, it achieves up to 0.95 average accuracy. The implementation, including source code and a demonstration video, is publicly released.
📝 Abstract
Environmental, Social, and Governance (ESG) reports have become central to how companies communicate climate risk, social impact, and governance practices, yet they are still published primarily as long, heterogeneous PDF documents. This makes it difficult to systematically answer seemingly simple questions. Existing tools either rely on brittle rule-based extraction or treat ESG reports as generic text, without explicitly modelling the underlying reporting standards. We present EulerESG, an LLM-powered system for automating ESG disclosure analysis with explicit awareness of ESG frameworks. EulerESG combines (i) dual-channel retrieval and LLM-driven disclosure analysis over ESG reports, and (ii) an interactive dashboard and chatbot for exploration, benchmarking, and explanation. Using four globally recognised companies and twelve SASB sub-industries, we show that EulerESG can automatically populate standard-aligned metric tables with high fidelity (up to 0.95 average accuracy) while remaining practical in end-to-end runtime, and we compare several recent LLMs in this setting. The full implementation, together with a demonstration video, is publicly available at https://github.com/UNSW-database/EulerESG.
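
As a rough illustration of the dual-channel retrieval idea, the following minimal Python sketch merges candidates from two independent channels. It is not the EulerESG implementation: the function names, the sample report chunks, and the scoring rules are all hypothetical. A bag-of-words cosine similarity stands in for the vector-database lookup, and a simple required-term check stands in for a standard-alignment model, so that the example runs with the standard library alone.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_channel(query, chunks, k=2):
    """Channel 1 (assumption: stand-in for an embedding/vector-DB search).

    Returns the indices of the top-k chunks with non-zero similarity.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(c))), i) for i, c in enumerate(chunks)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for score, i in scored[:k] if score > 0]


def alignment_channel(metric_terms, chunks):
    """Channel 2 (assumption: crude placeholder for clause alignment).

    Keeps only chunks that contain every term required by the metric.
    """
    required = set(metric_terms)
    return [i for i, c in enumerate(chunks) if required <= set(tokenize(c))]


def dual_channel_retrieve(query, metric_terms, chunks, k=2):
    """Merge both channels, aligned chunks first, without duplicates."""
    merged = []
    for i in alignment_channel(metric_terms, chunks) + semantic_channel(query, chunks, k):
        if i not in merged:
            merged.append(i)
    return merged


# Hypothetical report chunks, invented purely for this example.
chunks = [
    "Total scope 1 emissions were 120 kilotonnes of CO2e in the reporting year.",
    "Our board met eight times and reviewed the code of conduct.",
    "We reduced water withdrawal at manufacturing sites.",
    "Greenhouse gas emissions fell across global operations.",
]

candidates = dual_channel_retrieve(
    query="gross global scope 1 emissions",
    metric_terms=["scope", "1", "emissions"],
    chunks=chunks,
)
print(candidates)
```

In a pipeline of this shape, the merged candidates would then be passed to an LLM that fills in the corresponding metric entry. The strict channel contributes passages that match the metric definition exactly, while the semantic channel adds loosely related context that the strict channel would miss.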