STReasoner: Empowering LLMs for Spatio-Temporal Reasoning in Time Series via Spatial-Aware Reinforcement Learning

📅 2026-01-06

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing approaches to spatio-temporal time series focus primarily on prediction and lack explicit joint reasoning over temporal dynamics, spatial dependencies, and textual context, which limits their applicability in high-stakes domains such as transportation and power grids. To address this gap, the work introduces ST-Bench, a multitask benchmark for spatio-temporal reasoning, and proposes STReasoner, a framework that enables large language models to fuse time series, graph structure, and text for explicit reasoning. STReasoner is trained with a spatial-aware Group Relative Policy Optimization algorithm (S-GRPO), which uses the performance gain attributable to spatial information as a reward signal in reinforcement learning. Experiments report average accuracy gains of 17% to 135% at only 0.004× the cost of proprietary models, along with robust generalization to real-world data.
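To make the reward idea concrete, here is a minimal sketch of how a spatial-gain bonus could be folded into group-relative advantages. It is not the paper's formulation: the paired with-graph and without-graph scores, the weight `lam`, and both function names are assumptions introduced for illustration. Only the group normalization (reward minus group mean, divided by group standard deviation) follows the standard GRPO recipe.

```python
from statistics import mean, pstdev


def spatial_shaped_rewards(with_graph, without_graph, lam=0.5):
    """Add a bonus for the score gain that appears only when the graph is given.

    Hypothetical shaping: reward_i = s_i + lam * max(0, s_i - s_i_no_graph),
    where s_i is the task score of response i with spatial input and
    s_i_no_graph is a paired score with the spatial input removed.
    """
    if len(with_graph) != len(without_graph):
        raise ValueError("score lists must have the same length")
    return [w + lam * max(0.0, w - wo) for w, wo in zip(with_graph, without_graph)]


def group_relative_advantages(rewards, eps=1e-8):
    """Standard GRPO-style normalization within one group of sampled responses."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled responses to one prompt (made-up scores, 1.0 = correct).
with_graph = [1.0, 1.0, 0.0, 0.0]
without_graph = [0.0, 1.0, 0.0, 0.0]

rewards = spatial_shaped_rewards(with_graph, without_graph)
advantages = group_relative_advantages(rewards)
print(rewards)
print(advantages)
```

In this toy group, the first response is correct only when the graph is available, so it receives a larger reward than the second response, which is correct either way. A per-response bonus is used because adding the same constant to every reward in a group would be cancelled by the normalization.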

📝 Abstract

Spatio-temporal reasoning in time series involves the explicit synthesis of temporal dynamics, spatial dependencies, and textual context. This capability is vital for high-stakes decision-making in systems such as traffic networks, power grids, and disease propagation. However, the field remains underdeveloped because most existing works prioritize predictive accuracy over reasoning. To address this gap, we introduce ST-Bench, a benchmark consisting of four core tasks: etiological reasoning, entity identification, correlation reasoning, and in-context forecasting. The benchmark is built with a network SDE-based multi-agent data synthesis pipeline. We then propose STReasoner, which enables LLMs to integrate time series, graph structure, and text for explicit reasoning. To promote spatially grounded logic, we introduce S-GRPO, a reinforcement learning algorithm that rewards performance gains specifically attributable to spatial information. Experiments show that STReasoner achieves average accuracy gains between 17% and 135% at only 0.004× the cost of proprietary models and generalizes robustly to real-world data.
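The abstract does not specify the network SDE used for data synthesis. As a rough illustration of what such a generator can look like, the sketch below simulates one common form, a mean-reverting process on each node with diffusive coupling along graph edges, using the Euler-Maruyama scheme. The equation, the parameter names (`theta`, `mu`, `kappa`, `sigma`), and the function name are all assumptions and may differ from the authors' pipeline.

```python
import math
import random


def simulate_network_sde(adj, x0, steps=100, dt=0.01,
                         theta=1.0, mu=0.0, kappa=0.5, sigma=0.1, seed=0):
    """Euler-Maruyama simulation of an assumed network SDE:

        dX_i = [-theta * (X_i - mu) + kappa * sum_j A_ij * (X_j - X_i)] dt
               + sigma dW_i

    adj is an adjacency matrix (list of lists), x0 the initial node values.
    Returns a list of steps + 1 snapshots, one value per node in each.
    """
    rng = random.Random(seed)  # local RNG so runs are reproducible
    n = len(x0)
    x = list(x0)
    path = [list(x)]
    for _ in range(steps):
        nxt = []
        for i in range(n):
            # Spatial term: pull node i toward its neighbours.
            coupling = sum(adj[i][j] * (x[j] - x[i]) for j in range(n))
            drift = -theta * (x[i] - mu) + kappa * coupling
            noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            nxt.append(x[i] + drift * dt + noise)
        x = nxt
        path.append(list(x))
    return path


# Three nodes on a line graph: 0 - 1 - 2.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
path = simulate_network_sde(adj, x0=[1.0, 0.0, -1.0], steps=50)
print(len(path), path[-1])
```

Series generated this way carry both temporal structure (mean reversion) and spatial structure (neighbours influence each other), which is the kind of signal a spatio-temporal reasoning benchmark needs to expose.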

Problem

Research questions and friction points this paper is trying to address.

spatio-temporal reasoning

time series

explicit reasoning

spatial dependencies

temporal dynamics

Innovation

Methods, ideas, or system contributions that make the work stand out.

spatio-temporal reasoning

large language models

reinforcement learning