Climate Surrogates for Scalable Multi-Agent Reinforcement Learning: A Case Study with CICERO-SCM

📅 2025-10-09
🤖 AI Summary
Problem: High-fidelity multi-gas temperature models (e.g., CICERO-SCM) are computationally expensive, which hinders their integration into reinforcement learning (RL) frameworks for climate policy optimization. Method: We propose a lightweight recurrent neural network climate surrogate, pretrained on 20,000 multi-gas emission trajectories from CICERO-SCM. Contribution/Results: The surrogate predicts global-mean temperature with RMSE ≈ 0.0004 K and runs one-step inference approximately 1000× faster than the simulator, and training with it converges to the same optimal policies as training with the original simulator. Integrated into a multi-agent RL framework, it models the temperature response inside the environment loop and speeds up end-to-end training by more than 100×, which makes large-scale, multi-scenario regional climate policy experiments feasible.

📝 Abstract
Climate policy studies require models that capture the combined effects of multiple greenhouse gases on global temperature, but these models are computationally expensive and difficult to embed in reinforcement learning. We present a multi-agent reinforcement learning (MARL) framework that integrates a high-fidelity, highly efficient climate surrogate directly in the environment loop, enabling regional agents to learn climate policies under multi-gas dynamics. As a proof of concept, we introduce a recurrent neural network architecture pretrained on $20{,}000$ multi-gas emission pathways as a surrogate for the climate model CICERO-SCM. The surrogate model attains near-simulator accuracy, with global-mean temperature RMSE $\approx 0.0004\,\mathrm{K}$ and approximately $1000\times$ faster one-step inference. When substituted for the original simulator in a climate-policy MARL setting, it accelerates end-to-end training by $>\!100\times$. We show that the surrogate and simulator converge to the same optimal policies and propose a methodology to assess this property in cases where using the simulator is intractable. Our work bypasses the core computational bottleneck without sacrificing policy fidelity, enabling large-scale multi-agent experiments across alternative climate-policy regimes with multi-gas dynamics and high-fidelity climate response.
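
To make the phrase "surrogate in the environment loop" concrete, the following is a minimal sketch of how a recurrent surrogate could be stepped once per simulated year inside a multi-agent rollout. Everything in it is an illustrative assumption rather than the paper's implementation: the class `ToyRecurrentSurrogate`, its random (untrained) weights, the plain tanh recurrent cell, the two-gas emission vector, and the fixed placeholder policies are all hypothetical.

```python
import math
import random


class ToyRecurrentSurrogate:
    """Hypothetical stand-in for a pretrained recurrent climate surrogate.

    Weights are random here; a real surrogate would load weights fitted to
    simulator output. Only the one-step interface is the point.
    """

    def __init__(self, n_gases, hidden_size=8, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.hidden_size = hidden_size
        self.w_in = matrix(hidden_size, n_gases)
        self.w_rec = matrix(hidden_size, hidden_size)
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]

    def initial_state(self):
        return [0.0] * self.hidden_size

    def step(self, hidden, emissions):
        """Advance one year: (hidden, per-gas emissions) -> (new hidden, temperature)."""
        new_hidden = [
            math.tanh(
                sum(w * e for w, e in zip(self.w_in[i], emissions))
                + sum(w * h for w, h in zip(self.w_rec[i], hidden))
            )
            for i in range(self.hidden_size)
        ]
        temperature = sum(w * h for w, h in zip(self.w_out, new_hidden))
        return new_hidden, temperature


def make_constant_policy(scale):
    """Placeholder regional policy: ignores observations, emits fixed per-gas amounts."""
    return lambda year, temperature: [1.0 * scale, 0.5 * scale]


def run_episode(surrogate, policies, n_years):
    """Roll out regional policies with the surrogate inside the environment loop."""
    hidden = surrogate.initial_state()
    temperature = 0.0
    trajectory = []
    for year in range(n_years):
        # Each regional agent sees the shared temperature and picks per-gas emissions.
        regional = [policy(year, temperature) for policy in policies]
        # Sum regional emissions into one global per-gas vector.
        global_emissions = [sum(per_gas) for per_gas in zip(*regional)]
        # One cheap surrogate call replaces one simulator call.
        hidden, temperature = surrogate.step(hidden, global_emissions)
        trajectory.append(temperature)
    return trajectory


policies = [make_constant_policy(s) for s in (1.0, 0.8, 1.2)]
temps = run_episode(ToyRecurrentSurrogate(n_gases=2), policies, n_years=10)
print(len(temps))
```

Because the surrogate carries its own hidden state, the environment only needs to pass the current year's emissions at each step instead of re-running a full simulation, which is one plausible reading of why one-step inference cost dominates the training speed-up.
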
Problem

Research questions and friction points this paper is trying to address.

Developing efficient climate surrogates for multi-agent reinforcement learning
Accelerating climate policy training while maintaining accuracy
Enabling scalable experiments with multi-gas dynamics in climate models
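
The accuracy goal above is reported as a root-mean-square error (RMSE) on global-mean temperature. A minimal sketch of that metric follows, assuming the error is averaged over the time steps of a single trajectory; the source does not state how errors are aggregated across trajectories, and the temperature values below are made up for illustration.

```python
import math


def rmse(reference, predicted):
    """Root-mean-square error between two equal-length temperature series (in K)."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("series must be non-empty and of equal length")
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values: a simulator trajectory and a surrogate trajectory.
simulator_temps = [1.10, 1.20, 1.30]
surrogate_temps = [1.10, 1.20, 1.303]
print(rmse(simulator_temps, surrogate_temps))
```
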
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates efficient climate surrogate in MARL loop
Uses pretrained neural network for multi-gas dynamics
Accelerates training while maintaining policy fidelity
Oskar Bohn Lassen
Technical University of Denmark, Kongens Lyngby, Denmark
Serio Angelo Maria Agriesti
Technical University of Denmark, Kongens Lyngby, Denmark
Filipe Rodrigues
Technical University of Denmark (DTU)
Machine Learning, Reinforcement Learning, Intelligent Transportation Systems, Urban Mobility
Francisco Camara Pereira
Technical University of Denmark, Kongens Lyngby, Denmark