ClimAgent: LLM as Agents for Autonomous Open-ended Climate Science Analysis

📅 2026-04-18

🤖 AI Summary

Climate science research is hindered by rapidly growing volumes of multi-scale data, fragmented analytical tools, and the limited physical grounding and complex-reasoning ability of existing large language models (LLMs). To address these challenges, this work proposes ClimAgent, a general-purpose autonomous analysis framework tailored to real-world climate science scenarios. ClimAgent enables LLMs to carry out end-to-end modeling and analysis tasks across climate sub-fields by combining a unified tool-use environment with rigorous reasoning protocols. The study also introduces ClimaBench, described as the first comprehensive benchmark for real-world climate discovery, whose problems span five task categories and are derived from professional scenarios between 2000 and 2025. On ClimaBench, ClimAgent is reported to improve solution rigorousness and practicality by 40.21% over the original LLM solutions.

📝 Abstract

Climate research is pivotal for mitigating global environmental crises, yet the accelerating volume of multi-scale datasets and the complexity of analytical tools have created significant bottlenecks, constraining scientific discovery to fragmented and labor-intensive workflows. While the emergence of Large Language Models (LLMs) offers a transformative paradigm to scale scientific expertise, existing explorations remain largely confined to simple Question-Answering (Q&A) tasks. These approaches often oversimplify real-world challenges, neglecting the intricate physical constraints and the data-driven nature of professional climate science. To bridge this gap, we introduce ClimAgent, a general-purpose autonomous framework designed to execute a wide spectrum of research tasks across diverse climate sub-fields. By integrating a unified tool-use environment with rigorous reasoning protocols, ClimAgent transcends simple retrieval to perform end-to-end modeling and analysis. To foster systematic evaluation, we propose ClimaBench, the first comprehensive benchmark for real-world climate discovery. It encompasses challenging problems spanning 5 distinct task categories derived from professional scenarios between 2000 and 2025. Experiments on ClimaBench demonstrate that ClimAgent significantly outperforms state-of-the-art baselines, achieving a 40.21% improvement over original LLM solutions in solution rigorousness and practicality. Our code is available at https://github.com/usail-hkust/ClimAgent.

Problem

Research questions and friction points this paper is trying to address.

climate science

large language models

autonomous analysis

scientific discovery

data complexity

Innovation

Methods, ideas, or system contributions that make the work stand out.

ClimAgent

Large Language Models

autonomous agents

climate science analysis

ClimaBench
