🤖 AI Summary
Large language models (LLMs) exhibit limited capability in understanding time series data, hindering their application in temporal decision-making scenarios. To address this, the authors propose the first intermediate training (mid-training) paradigm specifically designed for time series understanding, introducing Book-of-Thoth—a high-quality, task- and domain-agnostic corpus of aligned time series–text pairs—and KnoTS, a new knowledge-intensive benchmark for time series comprehension. By leveraging bidirectional time series–text generation as a mid-training objective, the resulting model family, Thoth, substantially outperforms existing models across multiple time series question-answering benchmarks. Notably, it also achieves superior fine-tuning performance under data-scarce conditions, validating the effectiveness of the proposed paradigm in enhancing LLMs' temporal reasoning and understanding capabilities.
📝 Abstract
Large Language Models (LLMs) have demonstrated remarkable success in general-purpose reasoning. However, they still struggle to understand and reason about time series data, which limits their effectiveness in decision-making scenarios that depend on temporal dynamics. In this paper, we propose Thoth, the first family of mid-trained LLMs with general-purpose time series understanding capabilities. As a pivotal intermediate stage, mid-training achieves task- and domain-agnostic alignment between time series and natural language, for which we construct Book-of-Thoth, a high-quality, time-series-centric mid-training corpus. Book-of-Thoth enables both time-series-to-text and text-to-time-series generation, equipping LLMs with a foundational grasp of temporal patterns. To better evaluate advanced reasoning capabilities, we further present KnoTS, a novel benchmark of knowledge-intensive time series understanding, designed for joint reasoning over temporal patterns and domain knowledge. Extensive experiments demonstrate that mid-training with Book-of-Thoth enables Thoth to significantly outperform its base model and advanced LLMs across a range of time series question answering benchmarks. Moreover, Thoth exhibits superior capabilities when fine-tuned under data scarcity, underscoring the effectiveness of mid-training for time series understanding. Code is available at: https://github.com/thuml/Thoth.
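To make the bidirectional objective concrete, here is a minimal sketch of what one aligned time series–text pair and its two prompt directions might look like. The field names, serialization, and prompt wording are illustrative assumptions, not the paper's actual Book-of-Thoth format.

```python
# Hypothetical aligned time series–text pair; the schema is an assumption
# for illustration, not the corpus's real structure.
def serialize_series(values, precision=2):
    """Render a numeric series as comma-separated text for an LLM prompt."""
    return ", ".join(f"{v:.{precision}f}" for v in values)

pair = {
    "series": [1.0, 1.2, 1.5, 2.1, 3.4, 5.6],
    "text": "A monotonically increasing series with accelerating growth.",
}

# Time-series-to-text direction: the series is the input, the text the target.
ts2text_prompt = f"Describe this series: {serialize_series(pair['series'])}"

# Text-to-time-series direction: the text is the input, the series the target.
text2ts_prompt = f"Generate a series matching: {pair['text']}"

print(ts2text_prompt)
print(text2ts_prompt)
```

Training on both directions over many such pairs is what aligns the two modalities in a task- and domain-agnostic way, before any downstream fine-tuning.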