AI Summary
This study addresses the lack of systems in the Gulf region that translate multi-source climate science and policy evidence into actionable decision support, a gap compounded by the limited ability of general-purpose large language models (LLMs) to handle region-specific climate knowledge and to interact with geospatial tools. To bridge this gap, the authors propose a unified framework comprising the first fine-grained, Gulf-focused climate question-answering dataset (GCA-DS) and a domain-specific intelligent agent (GCA) tightly integrated with geospatial tooling. The framework combines policy documents, academic literature, remote sensing imagery, and extreme weather reports through multimodal alignment, domain-adaptive fine-tuning, and interpretable visualizations to provide end-to-end data-to-decision support. Experimental results show that the proposed approach substantially outperforms general-purpose LLM baselines on Gulf-specific climate tasks, indicating that domain fine-tuning and tool integration are key to improving model reliability and practical utility.
Abstract
Climate decision-making in the Gulf increasingly demands systems that can translate heterogeneous scientific and policy evidence into actionable guidance, yet general-purpose large language models (LLMs) remain weak in both region-specific climate knowledge and grounded interaction with geospatial and forecasting tools. We present the GCA framework, which unifies (i) GCA-DS, a curated Gulf-focused multimodal dataset, and (ii) the Gulf Climate Agent (GCA), a tool-augmented agent for climate analysis. GCA-DS comprises ~200k question-answer pairs spanning governmental policies and adaptation plans, NGO and international frameworks, academic literature, and event-driven reporting on heatwaves, dust storms, and floods, complemented by remote-sensing inputs that pair imagery with textual evidence. Building on this foundation, the GCA agent orchestrates a modular tool pipeline that draws on real-time and historical signals and on geospatial processing to produce derived indices and interpretable visualizations. Finally, we benchmark open and proprietary LLMs on Gulf climate tasks and show that domain fine-tuning and tool integration substantially improve reliability over general-purpose baselines.
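The abstract does not specify how the tool pipeline is built, so the following is only a minimal sketch of what a modular pipeline producing derived indices could look like. The tool names (`mean_anomaly`, `hot_day_count`), the `TOOLS` registry, the `run_pipeline` helper, and the temperature values are all hypothetical illustrations, not part of GCA or GCA-DS.

```python
from typing import Any, Callable, Dict, List, Tuple


def mean_anomaly(series: List[float], baseline: float) -> float:
    """Mean departure of the series from a fixed baseline value."""
    return sum(series) / len(series) - baseline


def hot_day_count(series: List[float], threshold: float) -> int:
    """Number of days whose value is strictly above the threshold."""
    return sum(1 for value in series if value > threshold)


# Hypothetical registry: an agent would pick tools by name from a table like this.
TOOLS: Dict[str, Callable[..., Any]] = {
    "mean_anomaly": mean_anomaly,
    "hot_day_count": hot_day_count,
}


def run_pipeline(
    series: List[float], steps: List[Tuple[str, Dict[str, float]]]
) -> Dict[str, Any]:
    """Run each (tool_name, kwargs) step on the same input and collect the results."""
    results: Dict[str, Any] = {}
    for name, kwargs in steps:
        results[name] = TOOLS[name](series, **kwargs)
    return results


# Illustrative daily maximum temperatures in degrees Celsius (made-up values).
daily_max_c = [41.0, 43.0, 45.0, 47.0, 44.0]
report = run_pipeline(
    daily_max_c,
    [
        ("mean_anomaly", {"baseline": 42.0}),
        ("hot_day_count", {"threshold": 44.0}),
    ],
)
print(report)
```

Keeping each tool as a plain function behind a name-keyed registry is one simple way to make the pipeline modular: new indices can be added without changing the dispatch loop.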