🤖 AI Summary
Pseudo-alignment—the misalignment between pretrained language model embeddings and time-series structures—is the primary cause of inferior performance of LLM4TS methods relative to linear baselines. This stems from the “cone effect” inherent in language model embedding spaces, which conflicts with the low-dimensional manifold geometry intrinsic to time series. To address this, we propose TimeSUP, a manifold-lifting approach that actively expands temporal representations to match the intrinsic dimensionality of language embeddings while preserving modality-specific characteristics. By integrating manifold geometric priors with cross-modal representation learning, TimeSUP effectively mitigates the cone effect and enhances temporal discriminability. Experiments demonstrate that TimeSUP consistently outperforms existing LLM4TS methods on long-horizon forecasting benchmarks. Moreover, it is plug-and-play compatible with four mainstream LLM4TS frameworks, delivering uniform and significant improvements across all.
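The "cone effect" referenced above can be made concrete with a toy measurement: in an anisotropic embedding space, even unrelated embeddings share a dominant direction, so random pairs have high cosine similarity. A minimal sketch on synthetic data (illustrative only, not the paper's experiments):

```python
import numpy as np

def mean_pairwise_cosine(emb: np.ndarray) -> float:
    """Average cosine similarity over all distinct pairs of rows."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = unit @ unit.T
    n = len(emb)
    # Exclude the diagonal (self-similarity is always 1).
    return (sims.sum() - n) / (n * (n - 1))

rng = np.random.default_rng(0)

# Isotropic embeddings: directions spread over the whole sphere.
isotropic = rng.normal(size=(512, 64))

# "Cone" embeddings: a shared offset dominates every vector, so all
# of them point in roughly the same direction (a narrow cone).
cone = rng.normal(size=(512, 64)) + 8.0 * np.ones(64)

print(mean_pairwise_cosine(isotropic))  # near 0
print(mean_pairwise_cosine(cone))       # close to 1
```

When time-series tokens are injected into such a cone-shaped space, high cosine similarity to language tokens can arise from the shared offset rather than from genuine semantic alignment, which is one way to read the pseudo-alignment phenomenon described above.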
📝 Abstract
Pseudo-alignment is a pervasive challenge in many large language models for time series (LLM4TS) methods, often causing them to underperform compared to linear models or randomly initialised backbones. However, there has been little discussion in the community of why pseudo-alignment occurs. In this work, we conduct a thorough investigation into the root causes of pseudo-alignment in LLM4TS and connect it to the cone effect in LLMs. We demonstrate that pseudo-alignment arises from the interplay between the cone effect within pretrained LLM components and the intrinsically low-dimensional manifold of time-series data. We also introduce TimeSUP, a novel technique designed to mitigate this issue and improve forecasting performance in existing LLM4TS approaches. TimeSUP raises the intrinsic dimension of the time-series manifold to more closely match that of the language embeddings, allowing the model to distinguish temporal signals clearly while still capturing shared structure across modalities. As a result, representations for time and language tokens remain distinct yet exhibit high cosine similarity, signifying that the model preserves each modality's unique features while learning their commonalities in a unified embedding space. Empirically, TimeSUP consistently outperforms state-of-the-art LLM4TS methods and other lightweight baselines on long-term forecasting. Furthermore, it can be seamlessly integrated into four existing LLM4TS pipelines, delivering significant improvements in forecasting performance.