Fast Policy Learning for 6-DOF Position Control of Underwater Vehicles

📅 2025-12-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the insufficient reliability of six-degree-of-freedom (6-DOF) position control for autonomous underwater vehicles (AUVs) in complex, dynamic marine environments—where conventional controllers exhibit poor disturbance rejection and reinforcement learning (RL) suffers from slow training and challenging sim-to-real transfer—this work proposes an efficient and transferable RL control framework. Methodologically, it introduces zero-shot RL policy transfer to a physical AUV platform, achieving the first validation of full 6-DOF position control in real underwater settings. It further devises a GPU-accelerated, jointly just-in-time (JIT)-compiled training paradigm leveraging JAX and MuJoCo-XLA (MJX), reducing a single training run to under two minutes. Experimentally, the system demonstrates high-precision trajectory tracking and robust suppression of environmental disturbances, validating both its control accuracy and its adaptability to realistic oceanic conditions.

📝 Abstract
Autonomous Underwater Vehicles (AUVs) require reliable six-degree-of-freedom (6-DOF) position control to operate effectively in complex and dynamic marine environments. Traditional controllers are effective under nominal conditions but exhibit degraded performance when faced with unmodeled dynamics or environmental disturbances. Reinforcement learning (RL) provides a powerful alternative, but training is typically slow and sim-to-real transfer remains challenging. This work introduces a GPU-accelerated RL training pipeline built in JAX and MuJoCo-XLA (MJX). By jointly JIT-compiling large-scale parallel physics simulation and learning updates, we achieve training times of under two minutes. Through systematic evaluation of multiple RL algorithms, we show robust 6-DOF trajectory tracking and effective disturbance rejection in real underwater experiments, with policies transferred zero-shot from simulation. Our results provide the first explicit real-world demonstration of RL-based AUV position control across all six degrees of freedom.
Problem

Research questions and friction points this paper is trying to address.

Develops fast reinforcement learning for 6-DOF AUV position control.
Addresses slow training and sim-to-real transfer challenges in RL.
Demonstrates robust trajectory tracking and disturbance rejection underwater.
Innovation

Methods, ideas, or system contributions that make the work stand out.

GPU-accelerated RL training pipeline in JAX and MJX
JIT-compiled parallel simulation and learning updates
Zero-shot policy transfer from simulation to real AUVs
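The core speed claim rests on fusing the parallel physics step and the learning update into a single JIT-compiled program. A minimal sketch of that pattern in JAX is shown below, assuming a toy damped point-mass environment in place of the paper's MJX physics, a hypothetical linear feedback policy in place of the trained RL policy, and a plain gradient step in place of the paper's RL algorithms:

```python
import jax
import jax.numpy as jnp

# Toy stand-in for an MJX physics step: a damped point mass driven by the
# policy action. The real pipeline would call mjx.step on a batched model state.
def env_step(state, action):
    pos, vel = state
    vel = 0.9 * vel + 0.1 * action
    pos = pos + 0.1 * vel
    return (pos, vel)

def policy(params, pos):
    # Hypothetical linear position-feedback policy (a stand-in for the RL policy).
    return -params * pos

def loss_fn(params, states):
    # Advance every parallel environment one step and penalise distance from origin.
    pos, vel = states
    actions = policy(params, pos)
    new_pos, _ = jax.vmap(env_step)((pos, vel), actions)
    return jnp.mean(new_pos ** 2)

@jax.jit  # simulation step and learning update fused into one compiled program
def train_step(params, states, lr=0.1):
    loss, grad = jax.value_and_grad(loss_fn)(params, states)
    return params - lr * grad, loss

# Thousands of parallel environments, as in large-batch GPU training.
key = jax.random.PRNGKey(0)
pos = jax.random.normal(key, (4096,))
vel = jnp.zeros(4096)
params = jnp.array(0.0)

for _ in range(50):
    params, loss = train_step(params, (pos, vel))
```

Because `jax.vmap` batches the physics step and `jax.jit` compiles simulation and gradient update together, XLA can keep the whole loop body on the GPU with no host round-trips per step, which is the mechanism behind the sub-two-minute training times reported above.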
Sümer Tunçay
School of Engineering and Physical Sciences, Heriot-Watt University, Edinburgh, United Kingdom
Alain Andres
TECNALIA, BRTA, San Sebastian, Spain
Ignacio Carlucho
Assistant Professor, Heriot-Watt University
Robotics · Reinforcement Learning · Multi-agent systems · Machine Learning