🤖 AI Summary
This paper addresses the challenge that large language models (LLMs) struggle to infer interpretable, structured events from purely numerical time series, a bottleneck stemming from the absence of textual supervision, data scarcity, and semantic misalignment. To this end, the authors introduce number-to-event reasoning and decoding as a novel task. Their solution is a reasoning-aware framework with three components: an agent-guided event extractor (AGE) that identifies latent events without textual grounding; an event-driven synthetic generator (EveDTS), built on a marked multivariate Hawkes process, that models temporal event dependencies; and a two-stage fine-tuning pipeline that pairs a time-series encoder with a structured decoder. Evaluated across diverse real-world domains, the method substantially outperforms strong LLM baselines in both event-level precision and recall, enabling interpretable, semantics-aware decoding from quantitative signals to human-understandable events.
📝 Abstract
Large language models (LLMs) have recently demonstrated impressive multimodal reasoning capabilities, yet their understanding of purely numerical time-series signals remains limited. Existing approaches mainly focus on forecasting or trend description, without uncovering the latent events that drive numerical changes or explaining the reasoning process behind them. In this work, we introduce the task of number-to-event reasoning and decoding, which aims to infer interpretable structured events from numerical inputs, even when accompanying text is unavailable. To address the data scarcity and semantic alignment challenges, we propose a reasoning-aware framework that integrates an agent-guided event extractor (AGE), a marked multivariate Hawkes-based synthetic generator (EveDTS), and a two-stage fine-tuning pipeline combining a time-series encoder with a structured decoder. Our model explicitly reasons over numerical changes, generates intermediate explanations, and outputs structured event hypotheses. Experiments on multi-domain datasets show that our method substantially outperforms strong LLM baselines in event-level precision and recall. These results suggest a new direction for bridging quantitative reasoning and semantic understanding, enabling LLMs to explain and predict events directly from numerical dynamics.
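The synthetic generator described above builds on a marked multivariate Hawkes process, in which past events of each type raise the arrival rate of future events. The paper does not publish its generator code, so the sketch below is only an illustration of the underlying process: a minimal Ogata-thinning simulator with exponential excitation kernels, where `mu`, `alpha`, `beta`, and the two-type setup are all hypothetical parameters chosen for the example.

```python
import math
import random

def simulate_hawkes(mu, alpha, beta, T, seed=0):
    """Simulate a multivariate Hawkes process via Ogata's thinning.

    mu:    baseline intensity per event type (list of length d)
    alpha: excitation weights, alpha[i][j] = how much a type-j event
           excites type i
    beta:  exponential decay rate of the excitation kernel
    T:     time horizon
    Returns a list of (time, event_type) pairs, sorted by time.
    """
    rng = random.Random(seed)
    d = len(mu)
    events = []

    def intensity(i, t):
        # lambda_i(t) = mu_i + sum over past events of alpha[i][j] * exp(-beta * (t - s))
        lam = mu[i]
        for s, j in events:
            lam += alpha[i][j] * math.exp(-beta * (t - s))
        return lam

    t = 0.0
    while t < T:
        # With exponential kernels the total intensity is non-increasing
        # between events, so its current value is a valid thinning bound.
        lam_bar = sum(intensity(i, t) for i in range(d))
        t += rng.expovariate(lam_bar)
        if t >= T:
            break
        lams = [intensity(i, t) for i in range(d)]
        if rng.random() * lam_bar <= sum(lams):
            # Accept the candidate; pick its type proportionally to intensity.
            u = rng.random() * sum(lams)
            acc = 0.0
            for i in range(d):
                acc += lams[i]
                if u <= acc:
                    events.append((t, i))
                    break
    return events

# Hypothetical two-type example: type 0 and type 1 mutually excite each other.
events = simulate_hawkes(mu=[0.2, 0.1],
                         alpha=[[0.3, 0.1], [0.2, 0.2]],
                         beta=1.0, T=50.0)
print(len(events))
```

In the paper's setting each simulated event would additionally carry a mark (a structured event description) and be paired with a numerical series it plausibly explains; the self-exciting dynamics are what make the synthetic event streams temporally dependent rather than independent noise.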