🤖 AI Summary
To address the challenges of sparse trajectory modeling and insufficient contextual information exploitation in typhoon track forecasting, this paper proposes a language-enhanced multimodal Transformer framework. It introduces meteorological semantic descriptions—generated by large language models—as prompt tokens into trajectory prediction for the first time, enabling joint encoding of textual embeddings and numerical time-series data within a unified architecture to achieve deep semantic-temporal fusion. This approach effectively captures high-order dynamic priors—such as synoptic-scale circulation patterns and environmental steering mechanisms—that are poorly represented in conventional numerical data, thereby significantly improving robustness for nonlinear track abrupt changes and data-scarce scenarios. Evaluated on the HURDAT2 benchmark, our method reduces mean track error by 18.7% over current state-of-the-art models, with particularly pronounced gains in long-horizon predictions (48–96 hours).
📝 Abstract
Accurate typhoon track forecasting is crucial for early system warning and disaster response. While Transformer-based models have demonstrated strong performance in modeling the temporal dynamics of dense trajectories of humans and vehicles in smart cities, they usually lack access to broader contextual knowledge that enhances the forecasting reliability of sparse meteorological trajectories, such as typhoon tracks. To address this challenge, we propose TyphoFormer, a novel framework that incorporates natural language descriptions as auxiliary prompts to improve typhoon trajectory forecasting. For each time step, we use Large Language Model (LLM) to generate concise textual descriptions based on the numerical attributes recorded in the North Atlantic hurricane database. The language descriptions capture high-level meteorological semantics and are embedded as auxiliary special tokens prepended to the numerical time series input. By integrating both textual and sequential information within a unified Transformer encoder, TyphoFormer enables the model to leverage contextual cues that are otherwise inaccessible through numerical features alone. Extensive experiments are conducted on HURDAT2 benchmark, results show that TyphoFormer consistently outperforms other state-of-the-art baseline methods, particularly under challenging scenarios involving nonlinear path shifts and limited historical observations.