🤖 AI Summary
This work investigates whether large language models pretrained on text can be transferred effectively to time series forecasting, and explains the cross-modal transfer mechanism. The study finds that language pretraining implicitly builds a reusable manifold whose geometric structure supports time series reconstruction and forecasting without paired supervision. Using linear probing, representation retrieval, subspace alignment, and loss landscape analysis, the authors show that realistic trajectories can be decoded with only low-dimensional alignment or minimal finetuning, rather than by learning temporal patterns from scratch. Retrieval in the probed space yields competitive forecasts, highlighting the transferability and optimization efficiency of the pretrained manifold.
📝 Abstract
Can language-pretrained transformers become effective time-series forecasters, and why? In this paper, we show that cross-modal transfer arises because language pretraining preconditions time-series training with a reusable manifold. A linear probe on frozen LLM states decodes realistic time-series trajectories without paired supervision, and retrieval in this projected space yields competitive forecasts, showing that structure and dynamics exist before finetuning. Pretrained initialization also improves optimization, producing coherent gradients and a highly anisotropic loss landscape, in contrast to random initialization. Finetuning then acts as low-dimensional alignment, reusing existing directions rather than learning temporal primitives from scratch, as evidenced by low-rank updates, subspace alignment, and shared features for periodicity, trend, and repetition. Together, these results support a geometric account of LLM-to-time-series transfer: language pretraining builds the manifold, and finetuning projects numerical dynamics onto task-relevant directions.
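To make "retrieval in this projected space" concrete, the following is a minimal sketch of nearest-neighbor forecasting after a linear projection. It is not the paper's implementation: the abstract does not specify the probe, the distance metric, or the data, so the function names (`project`, `retrieve_forecast`), the Euclidean metric, the projection matrix `W`, and the toy 3-dimensional "hidden states" are all assumptions chosen for illustration.

```python
import math

def project(W, h):
    """Apply a linear map W (one row per output dimension) to a state h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def retrieve_forecast(query_state, library_states, library_futures, W, k=1):
    """Average the stored futures of the k library states that are
    closest to the query after projection (Euclidean distance; an
    assumption, the abstract does not name a metric)."""
    q = project(W, query_state)
    ranked = sorted(
        (math.dist(q, project(W, s)), idx)
        for idx, s in enumerate(library_states)
    )
    nearest = [idx for _, idx in ranked[:k]]
    horizon = len(library_futures[0])
    return [
        sum(library_futures[i][t] for i in nearest) / len(nearest)
        for t in range(horizon)
    ]

# Hypothetical toy data standing in for frozen LLM states and the
# future values that followed each stored context.
library_states = [[1.0, 0.0, 5.0], [0.0, 1.0, -5.0], [-1.0, 0.0, 0.0]]
library_futures = [[1.0, 2.0], [0.0, 0.0], [-1.0, -2.0]]

# Hypothetical probe: keeps the first two coordinates and discards the third.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

query_state = [0.9, 0.1, -5.0]
forecast = retrieve_forecast(query_state, library_states, library_futures, W)
print(forecast)
```

In this toy setup the projection matters: in the raw 3-dimensional space the query is closest to the second library state, but once the third coordinate is projected away it is closest to the first, so the first state's stored future is returned. Setting `k` above 1 averages the futures of several neighbors instead.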