ShipTraj-R1: Reinforcing Ship Trajectory Prediction in Large Language Models via Group Relative Policy Optimization

📅 2026-03-03
📈 Citations: 0 · Influential citations: 0
🤖 AI Summary
This work addresses the limited reasoning capability and accuracy of existing vessel trajectory prediction methods in complex maritime scenarios by reformulating the task as a text-to-text generation problem built on the large language model Qwen3. The method introduces dynamic prompting to guide adaptive chain-of-thought reasoning, integrates a comprehensive reward function grounded in maritime navigation rules, and employs Group Relative Policy Optimization (GRPO) for reinforcement fine-tuning. By combining domain-specific prompts, rule-driven rewards, and the GRPO algorithm, the proposed framework significantly outperforms current deep learning and LLM-based baselines on two real-world, complex maritime datasets, achieving the lowest prediction error among the compared methods.
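The rule-based reward described above can be pictured as a weighted sum of a reasoning-format check and a prediction-accuracy term. The sketch below is illustrative only: the tag names, weights, and error scale are assumptions, not the paper's actual reward design.

```python
# Illustrative sketch of a rule-based reward for an LLM trajectory
# predictor: a format reward for producing a chain-of-thought block plus
# an answer block, and an accuracy reward that decays with lat/lon error.
# All tag names, weights, and scales are hypothetical.
import math
import re


def format_reward(text: str) -> float:
    """1.0 if the output contains both a CoT block and an answer block."""
    has_cot = bool(re.search(r"<think>.*</think>", text, re.S))
    has_ans = bool(re.search(r"<answer>.*</answer>", text, re.S))
    return 1.0 if (has_cot and has_ans) else 0.0


def accuracy_reward(pred: tuple, target: tuple, scale: float = 1.0) -> float:
    """Decays from 1 toward 0 as the predicted position drifts (illustrative)."""
    err = math.hypot(pred[0] - target[0], pred[1] - target[1])
    return math.exp(-err / scale)


def total_reward(text, pred, target, w_fmt: float = 0.2, w_acc: float = 0.8) -> float:
    """Weighted combination of format and accuracy rewards."""
    return w_fmt * format_reward(text) + w_acc * accuracy_reward(pred, target)
```

A completion that follows the expected format and lands exactly on the target position would score the maximum reward; malformed or inaccurate completions score lower, which is what steers the policy during reinforcement fine-tuning.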

📝 Abstract
Recent advancements in reinforcement fine-tuning have significantly improved the reasoning ability of large language models (LLMs). In particular, methods such as group relative policy optimization (GRPO) have demonstrated strong capabilities across various fields. However, applying LLMs to ship trajectory prediction remains largely unexplored. In this paper, we propose ShipTraj-R1, a novel LLM-based framework that reformulates ship trajectory prediction as a text-to-text generation problem. (1) We design a dynamic prompt containing trajectory information about conflicting ships to guide the model toward adaptive chain-of-thought (CoT) reasoning. (2) We introduce a comprehensive rule-based reward mechanism to incentivize both the reasoning format and the prediction accuracy of the model. (3) ShipTraj-R1 is reinforced through the GRPO mechanism guided by domain-specific prompts and rewards, and uses Qwen3 as the model backbone. Extensive experimental results on two complex, real-world maritime datasets show that the proposed ShipTraj-R1 achieves the lowest error compared with state-of-the-art deep learning and LLM-based baselines.
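The distinguishing step of GRPO, as opposed to PPO-style methods with a learned critic, is that the advantage of each sampled completion is computed relative to its own sampling group. A minimal sketch of that group-normalization step, with made-up reward values and group size, could look like this (not the paper's implementation):

```python
# Minimal sketch of GRPO's group-relative advantage computation.
# For each prompt, G candidate completions are sampled and scored by the
# rule-based reward; each advantage is the reward standardized against
# the group's mean and standard deviation. Reward values below are
# purely illustrative.
import math


def group_relative_advantages(rewards: list, eps: float = 1e-8) -> list:
    """Standardize each reward against its sampling group's statistics."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for G = 4 sampled trajectory completions.
rewards = [0.9, 0.4, 0.6, 0.1]
advantages = group_relative_advantages(rewards)
```

Completions that beat their group's average receive positive advantages and are reinforced; below-average completions are suppressed, which removes the need for a separate value network.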
Problem

Research questions and friction points this paper is trying to address.

ship trajectory prediction
large language models
reinforcement fine-tuning
maritime datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

ShipTraj-R1
group relative policy optimization
trajectory prediction
large language models
chain-of-thought reasoning
Yang Zhan
iOPEN, Northwestern Polytechnical University, China
Geospatial AI · Spatio-Temporal AI · Vision-Language MultiModal · Large Language Model
Yunhao Li
State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
Zhang Chao
School of Mathematics and Statistics, Jiangsu Normal University
Yuxu Lu
Department of Logistics and Maritime Studies, The Hong Kong Polytechnic University, Hong Kong 999077, China
Yan Li
State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China