🤖 AI Summary
This study addresses high-precision, long-horizon ship trajectory prediction in real-world marine environments, where long-range temporal dependencies and dynamic factors such as ocean currents, wind, and waves complicate forecasting. The authors propose a hierarchical two-stage prediction framework: a long-term branch encodes navigational intent, while a short-term branch models local dynamics using a spatiotemporal graph Transformer over discretized maritime cells. An environment-aware module feeds oceanographic parameters into the model through cross-modal attention and feature-wise modulation, and the two branches are combined by a hierarchical fusion mechanism. A learnable Savitzky–Golay smoothing layer then improves the temporal consistency of the fused trajectory. Evaluated on Australian Craft Tracking System (CTS) data with 3-hour inputs and 10-hour forecasts, the method reduces average displacement error (ADE) and final displacement error (FDE) by 25% and 17%, respectively, relative to the state of the art.
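As a rough illustration of the smoothing idea mentioned above, the sketch below applies a fixed Savitzky-Golay filter (window 5, quadratic fit) to a one-dimensional coordinate sequence. It is not the authors' layer: the source does not say how the learnable version is parameterised, so the fixed coefficients, the edge-replication padding, and the names `SG_5_2` and `sg_smooth` are all assumptions made for this example. One plausible reading is that a learnable variant treats these filter taps as trainable weights initialised from the classical values.

```python
# Classical Savitzky-Golay smoothing coefficients for a 5-point window
# and a quadratic (degree-2) local polynomial fit.
SG_5_2 = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]


def sg_smooth(values, coeffs=SG_5_2):
    """Smooth a 1-D sequence with a fixed Savitzky-Golay kernel.

    Hypothetical helper: indices beyond the ends are clamped (edge
    replication), which is an assumption, not something stated in the source.
    """
    half = len(coeffs) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, c in enumerate(coeffs):
            j = min(max(i + k - half, 0), n - 1)  # clamp to a valid index
            acc += c * values[j]
        out.append(acc)
    return out


# A jittery latitude-like track (made-up numbers, for illustration only).
noisy = [0.0, 1.2, 1.8, 3.3, 3.9, 5.1, 6.0]
smoothed = sg_smooth(noisy)
```

Because the kernel comes from a local quadratic fit, it leaves constant, linear, and quadratic segments unchanged away from the edges, which is why it suppresses jitter without flattening smooth turns.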
📝 Abstract
Long-horizon vessel trajectory forecasting under real ocean conditions is critical for collision avoidance, traffic management, and route planning. However, accurate prediction is challenging because of long-range temporal dependencies and dynamic environmental factors such as currents, wind, and waves. To address these issues, we propose a two-stage framework that combines a coarse long-term predictor with a grid-aware short-term predictor through a hierarchical fusion mechanism. The short-term branch applies a Spatio-Temporal Graph Transformer to discretized maritime cells to capture localized dynamics, while the long-term branch encodes overarching navigational intent. An integrated environmental module incorporates oceanographic parameters, including surface currents, wind vectors, and significant wave height, using cross-modal attention and feature-wise modulation so that the model adapts to varying sea conditions. Additionally, a learnable Savitzky-Golay smoothing layer enhances temporal coherence in the fused trajectories. We evaluate our approach on Australian Craft Tracking System (CTS) data from the North West region, aligned with Copernicus Marine Service products, using a 3-hour input window and a 10-hour prediction horizon. Experimental results show that our framework improves on the state of the art by 25% in Average Displacement Error (ADE) and 17% in Final Displacement Error (FDE). Ablation studies further validate the contribution of each component.
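For readers unfamiliar with the two reported metrics, the following minimal sketch computes ADE and FDE for a single predicted trajectory. It assumes positions are already projected to planar coordinates in a common unit and uses Euclidean distance; the paper may instead use geodesic distance on latitude and longitude. The function name `displacement_errors` and the sample points are hypothetical.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one trajectory given as lists of (x, y) points.

    Assumes planar coordinates in the same unit for both trajectories.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between prediction and ground truth at each step.
    dists = [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred, truth)
    ]
    ade = sum(dists) / len(dists)  # mean error over the whole horizon
    fde = dists[-1]                # error at the final forecast step only
    return ade, fde


# Made-up three-step example: the prediction drifts steadily off course.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(0.0, 0.0), (1.0, 3.0), (2.0, 6.0)]
ade, fde = displacement_errors(pred, truth)  # ade = 3.0, fde = 6.0
```

ADE averages the error over every forecast step, while FDE looks only at the last one, so on a 10-hour horizon FDE is the more sensitive indicator of accumulated drift.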