🤖 AI Summary
Existing time-series forecasting methods face two key bottlenecks when directly adapting large language models (LLMs): heavy reliance on large-scale paired text supervision and modality-incompatible representations arising from fundamental disparities between textual and time-series data. To address these challenges, we propose a text-agnostic hierarchical cross-modal alignment framework featuring a novel three-level (input/feature/output) alignment mechanism. Specifically, we generate synthetic word embeddings via QR decomposition and integrate learnable prompts to enable label-free, text-guided learning; further, we introduce temporal representation distillation and multi-granularity alignment to bridge the modality gap. Crucially, our method eliminates the need for manual text annotations. Evaluated on multiple standard time-series benchmarks, it achieves state-of-the-art performance, demonstrating significant improvements in both prediction accuracy and cross-dataset generalization capability.
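The feature-level alignment mentioned above (temporal representation distillation) can be illustrated with a minimal sketch. This is not the paper's implementation: the function name, shapes, and the choice of a plain mean-squared-error objective are assumptions; the real method may use a different distillation criterion.

```python
import numpy as np

def distillation_loss(student_feats: np.ndarray, teacher_feats: np.ndarray) -> float:
    """Hypothetical feature-level alignment: mean-squared error between
    the time-series encoder's features (student) and the frozen LLM's
    hidden states (teacher), assumed already projected to the same shape."""
    return float(np.mean((student_feats - teacher_feats) ** 2))

# Stand-in features (batch x seq_len x dim); in practice these would come
# from the time-series encoder and the LLM, respectively.
rng = np.random.default_rng(0)
student = rng.standard_normal((4, 16, 32))
teacher = rng.standard_normal((4, 16, 32))

loss = distillation_loss(student, teacher)
assert loss >= 0.0
assert distillation_loss(teacher, teacher) == 0.0
```

Minimizing such a loss pulls the time-series representations toward the LLM's representation space, which is the intuition behind bridging the modality gap at the feature level.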
📝 Abstract
Given the significant potential of large language models (LLMs) in sequence modeling, emerging studies have begun applying them to time-series forecasting. Despite notable progress, existing methods still face two critical challenges: 1) their reliance on large amounts of paired text data, limiting model applicability, and 2) a substantial modality gap between text and time series, leading to insufficient alignment and suboptimal performance. In this paper, we introduce **H**ierarchical **T**ext-**F**ree **A**lignment (**TS-HTFA**), a novel method that leverages hierarchical alignment to fully exploit the representation capacity of LLMs while eliminating the dependence on text data. Specifically, we replace paired text data with adaptive virtual text based on QR-decomposition word embeddings and learnable prompts. Furthermore, we establish comprehensive cross-modal alignment at three levels: input, feature, and output. Extensive experiments on multiple time-series benchmarks demonstrate that TS-HTFA achieves state-of-the-art performance, significantly improving prediction accuracy and generalization.
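The virtual-text idea at the input level can be sketched as follows. This is an illustrative reading, not the paper's code: the prototype matrix, the attention-style mixing, and all dimensions are assumptions. QR decomposition is used here to turn a set of (hypothetically learnable) prototype vectors into orthonormal "virtual word" embeddings, which time-series patch features then attend over to produce text-like inputs without any paired text.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_virtual, n_patches = 64, 16, 8

# Hypothetical prototype matrix (n_virtual x dim); in the framework these
# would be trained parameters, random values here for the sketch.
P = rng.standard_normal((n_virtual, dim))

# QR decomposition orthogonalizes the prototypes: the rows of W are
# orthonormal "virtual word" embeddings spanning a subspace of the
# LLM's embedding space.
Q, _ = np.linalg.qr(P.T)   # Q: (dim, n_virtual), orthonormal columns
W = Q.T                    # (n_virtual, dim)

# Time-series patch features from some encoder (stand-in values).
x = rng.standard_normal((n_patches, dim))

# Softly attend each patch over the virtual vocabulary to form
# label-free "virtual text" inputs for the frozen LLM.
scores = x @ W.T / np.sqrt(dim)                        # (n_patches, n_virtual)
attn = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
virtual_text = attn @ W                                # (n_patches, dim)

# Sanity checks: orthonormal virtual vocabulary, shape-preserving mapping.
assert np.allclose(W @ W.T, np.eye(n_virtual), atol=1e-6)
assert virtual_text.shape == x.shape
```

Orthogonalizing via QR keeps the virtual vocabulary well-conditioned (no two virtual words collapse onto each other), which is one plausible motivation for using a QR-based construction rather than raw learnable vectors.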