Climate Surrogates for Scalable Multi-Agent Reinforcement Learning: A Case Study with CICERO-SCM

📅 2025-10-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: High-fidelity multi-gas temperature models such as CICERO-SCM are computationally expensive, which hinders their integration into reinforcement learning (RL) frameworks for climate policy optimization. Method: We propose a lightweight recurrent neural network climate surrogate, pretrained on 20,000 multi-gas emission trajectories from CICERO-SCM. Contribution/Results: The surrogate predicts global-mean temperature with RMSE ≈ 0.0004 K and runs one-step inference roughly 1000× faster than the simulator, and training with it converges to the same optimal policy as training with the original simulator. Integrated into the environment loop of a multi-agent RL framework, it speeds up end-to-end training by more than 100×, which makes large-scale multi-agent experiments across alternative climate-policy regimes practical.
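The accuracy figure above is a root-mean-square error over temperature series. As a minimal sketch of how such a metric is computed, assuming the surrogate and simulator outputs are available as equal-length lists of global-mean temperatures in kelvin (the function name and the sample values are hypothetical and are not results from the paper):

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length temperature series (K)."""
    if len(predicted) != len(reference):
        raise ValueError("series must have the same length")
    if not reference:
        raise ValueError("series must be non-empty")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical values for illustration only, not data from the paper.
simulator_temps = [1.10, 1.15, 1.21, 1.28]
surrogate_temps = [1.10, 1.16, 1.21, 1.27]
error_k = rmse(surrogate_temps, simulator_temps)
```

In practice the error would be averaged over many held-out emission trajectories rather than a single short series.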

📝 Abstract

Climate policy studies require models that capture the combined effects of multiple greenhouse gases on global temperature, but these models are computationally expensive and difficult to embed in reinforcement learning. We present a multi-agent reinforcement learning (MARL) framework that integrates a high-fidelity, highly efficient climate surrogate directly in the environment loop, enabling regional agents to learn climate policies under multi-gas dynamics. As a proof of concept, we introduce a recurrent neural network architecture pretrained on $20{,}000$ multi-gas emission pathways as a surrogate for the climate model CICERO-SCM. The surrogate model attains near-simulator accuracy, with global-mean temperature RMSE $\approx 0.0004\,\mathrm{K}$ and approximately $1000\times$ faster one-step inference. When substituted for the original simulator in a climate-policy MARL setting, it accelerates end-to-end training by $>\!100\times$. We show that the surrogate and simulator converge to the same optimal policies and propose a methodology to assess this property in cases where using the simulator is intractable. Our work bypasses the core computational bottleneck without sacrificing policy fidelity, enabling large-scale multi-agent experiments across alternative climate-policy regimes with multi-gas dynamics and high-fidelity climate response.
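To make the idea of a recurrent surrogate "in the environment loop" concrete, here is a minimal sketch in plain Python. It is not the paper's architecture. The Elman-style cell, the random untrained weights, the class and function names, the rule that global emissions are the sum of regional emissions, and the use of the last temperature as the shared observation are all assumptions made for illustration.

```python
import math
import random


class RecurrentSurrogate:
    """Tiny Elman-style recurrent cell standing in for a trained surrogate.

    Weights are random and untrained (hypothetical); a real surrogate would
    load weights fitted to simulator trajectories.
    """

    def __init__(self, n_gases, hidden_size, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        self.hidden_size = hidden_size
        self.w_in = matrix(hidden_size, n_gases)       # emissions -> hidden
        self.w_rec = matrix(hidden_size, hidden_size)  # hidden -> hidden
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]

    def initial_state(self):
        return [0.0] * self.hidden_size

    def step(self, hidden, emissions):
        """One-step inference: returns (new hidden state, temperature)."""
        new_hidden = []
        for i in range(self.hidden_size):
            pre = sum(w * e for w, e in zip(self.w_in[i], emissions))
            pre += sum(w * h for w, h in zip(self.w_rec[i], hidden))
            new_hidden.append(math.tanh(pre))
        temperature = sum(w * h for w, h in zip(self.w_out, new_hidden))
        return new_hidden, temperature


def run_episode(surrogate, agent_policies, n_steps):
    """Step the surrogate once per environment step, carrying its hidden state."""
    hidden = surrogate.initial_state()
    temperature = 0.0
    history = []
    for _ in range(n_steps):
        # Each regional agent maps the shared observation to per-gas emissions.
        actions = [policy(temperature) for policy in agent_policies]
        # Assumption: global emissions per gas are the sum over regions.
        total_emissions = [sum(per_gas) for per_gas in zip(*actions)]
        hidden, temperature = surrogate.step(hidden, total_emissions)
        history.append(temperature)
    return history


# Two hypothetical regions emitting three gases with fixed policies.
policies = [
    lambda temp: [1.0, 0.2, 0.1],
    lambda temp: [0.5, 0.1, 0.05],
]
surrogate = RecurrentSurrogate(n_gases=3, hidden_size=4)
temps = run_episode(surrogate, policies, n_steps=5)
```

The point of the design is that each environment step costs a single recurrent update instead of a simulator call, while the hidden state carries the emission history forward between steps.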

Problem

Research questions and friction points this paper is trying to address.

Developing efficient climate surrogates for multi-agent reinforcement learning

Accelerating climate policy training while maintaining accuracy

Enabling scalable experiments with multi-gas dynamics in climate models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates efficient climate surrogate in MARL loop

Uses pretrained neural network for multi-gas dynamics

Accelerates training while maintaining policy fidelity

🔎 Similar Papers

Carbon Footprint Reduction for Sustainable Data Centers in Real-Time