Spatiotemporal Forecasting as Planning: A Model-Based Reinforcement Learning Approach with Generative World Models

📅 2025-10-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Addressing the dual challenges of inherent stochasticity and non-differentiable evaluation metrics in physical spatiotemporal forecasting, this paper proposes a novel model-based reinforcement learning paradigm that reformulates prediction as sequential planning. Methodologically, we construct a generative world model to simulate high-fidelity, diverse future states and employ domain-specific non-differentiable metrics—such as extreme-event hit rate—as sparse reward signals. We design a beam-search–guided, reward-driven imagination mechanism and introduce an iterative pseudo-labeling self-training strategy. Crucially, our framework enables end-to-end optimization of non-differentiable objectives without gradient approximation. Experiments demonstrate substantial reductions in overall prediction error alongside marked improvements in long-tail event detection. This work establishes a new pathway toward interpretable and robust forecasting for complex physical systems.

📝 Abstract
To address the dual challenges of inherent stochasticity and non-differentiable metrics in physical spatiotemporal forecasting, we propose Spatiotemporal Forecasting as Planning (SFP), a new paradigm grounded in Model-Based Reinforcement Learning. SFP constructs a novel Generative World Model to simulate diverse, high-fidelity future states, enabling an "imagination-based" environmental simulation. Within this framework, a base forecasting model acts as an agent, guided by a beam search-based planning algorithm that leverages non-differentiable domain metrics as reward signals to explore high-return future sequences. These identified high-reward candidates then serve as pseudo-labels to continuously optimize the agent's policy through iterative self-training, significantly reducing prediction error and demonstrating exceptional performance on critical domain metrics like capturing extreme events.
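The planning step described in the abstract can be pictured as beam search over futures sampled from the world model, scored by a sparse, non-differentiable domain metric. Below is a minimal NumPy sketch of that idea; the `world_model` callable, the grid shapes, and the extreme-event threshold are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def extreme_hit_rate(pred, obs, threshold=0.9):
    """Non-differentiable domain metric: fraction of extreme cells in the
    observation that the prediction also flags as extreme."""
    extreme = obs >= threshold
    if not extreme.any():
        return 0.0
    return float(((pred >= threshold) & extreme).sum() / extreme.sum())

def beam_search_rollout(world_model, state, obs_seq, beam_width=4, n_samples=8, rng=None):
    """Roll the generative world model forward, keeping the beam_width
    candidate trajectories with the highest cumulative reward. The reward
    is used only to rank candidates, so it never needs a gradient."""
    rng = rng or np.random.default_rng(0)
    beams = [(0.0, [state])]                # (cumulative reward, trajectory)
    for obs in obs_seq:                     # one planning step per future frame
        candidates = []
        for reward, traj in beams:
            for _ in range(n_samples):
                nxt = world_model(traj[-1], rng)          # sample a future state
                r = reward + extreme_hit_rate(nxt, obs)   # score with sparse metric
                candidates.append((r, traj + [nxt]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]                   # prune to the beam
    return beams[0]  # highest-return trajectory and its reward
```

The highest-return trajectory returned here is what the paper would treat as a pseudo-label for the agent.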
Problem

Research questions and friction points this paper is trying to address.

Addresses spatiotemporal forecasting challenges with stochasticity and non-differentiable metrics
Proposes model-based reinforcement learning with generative world simulation
Optimizes forecasting through planning algorithms using non-differentiable reward signals
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative World Model simulates diverse future states
Beam search planning uses non-differentiable metrics as rewards
Self-training optimizes policy with high-reward pseudo-labels
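The self-training bullet above amounts to alternating two phases: plan with the world model to collect high-reward futures, then regress the agent onto those futures as pseudo-labels. Here is a toy sketch of that loop using a linear agent and an MSE surrogate; the function names, the linear model, and the learning rate are hypothetical stand-ins for the paper's forecasting network and training schedule.

```python
import numpy as np

def self_training_step(agent_weights, inputs, pseudo_labels, lr=0.1):
    """One update fitting the agent (a linear map, for illustration) to the
    high-reward pseudo-labels via a gradient step on MSE -- the
    differentiable surrogate that stands in for the metric itself."""
    pred = inputs @ agent_weights
    grad = inputs.T @ (pred - pseudo_labels) / len(inputs)
    return agent_weights - lr * grad

def iterative_self_training(agent_weights, inputs, plan_fn, n_rounds=5):
    """Alternate planning (collect high-reward candidate futures) with
    policy updates, mirroring the iterative pseudo-labeling strategy."""
    for _ in range(n_rounds):
        pseudo_labels = plan_fn(agent_weights, inputs)  # high-reward candidates
        for _ in range(20):                             # fit agent to pseudo-labels
            agent_weights = self_training_step(agent_weights, inputs, pseudo_labels)
    return agent_weights
```

Because the non-differentiable metric only selects the pseudo-labels, the agent's optimization stays fully gradient-based, which is how the framework avoids gradient approximation of the metric.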
Authors
Hao Wu (Tsinghua University)
Yuan Gao (Tsinghua University)
Xingjian Shi (OpenAI)
Shuaipeng Li (Tencent)
Fan Xu (SLAI)
Fan Zhang (CUHK)
Zhihong Zhu (Tencent Jarvis Lab)
Weiyan Wang (Tencent)
Xiao Luo (University of Wisconsin)
Kun Wang (Nanyang Technological University)
Xian Wu (Tencent Jarvis Lab)
Xiaomeng Huang (Tsinghua University)