Swimming Under Constraints: A Safe Reinforcement Learning Framework for Quadrupedal Bio-Inspired Propulsion

📅 2026-03-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the lift fluctuations and instability that six-degree-of-freedom (6-DoF) fluid coupling amplifies in bio-inspired quadrupedal underwater propulsion. To balance propulsive thrust against motion stability, the authors formulate quadrupedal swimming as a constrained optimization problem and solve it with a safety-aware reinforcement learning method, Accelerated Constrained Proximal Policy Optimization with a PID-regulated Lagrange multiplier (ACPPO-PID). The PID-regulated multiplier enforces the safety constraints, conditional asymmetric clipping accelerates learning, and cycle-wise geometric aggregation stabilizes policy updates. Policies are initialized with imitation learning, refined in on-hardware towing-tank experiments, and then transferred to free-swimming trials on a real quadrupedal robot. Compared with state-of-the-art baselines, the method improves thrust efficiency, reduces destabilizing forces, and converges faster.
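
The constrained formulation mentioned above can be written in the generic form used for constrained policy optimization. This is only a sketch: the symbols below ($J_R$ for the expected thrust return, $J_C$ for the expected cost of destabilizing fluctuations, $d$ for the cost limit, $\lambda$ for the Lagrange multiplier) are assumptions made for illustration, and the paper's exact objective and cost definitions may differ.

```latex
% Generic constrained policy optimization (illustrative notation, not taken from the paper)
\begin{aligned}
\max_{\theta}\quad & J_R(\pi_\theta) \\
\text{s.t.}\quad   & J_C(\pi_\theta) \le d
\end{aligned}
\qquad\Longrightarrow\qquad
\min_{\lambda \ge 0}\ \max_{\theta}\ \; J_R(\pi_\theta) - \lambda \bigl( J_C(\pi_\theta) - d \bigr)
```

In the Lagrangian form, $\lambda$ grows while the constraint is violated and shrinks toward zero once it is satisfied; a PID-regulated multiplier replaces the usual plain gradient step on $\lambda$ with a feedback controller driven by the violation $J_C - d$.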

📝 Abstract

Bio-inspired aquatic propulsion offers high thrust and maneuverability but is prone to destabilizing forces such as lift fluctuations, which are further amplified by six-degree-of-freedom (6-DoF) fluid coupling. We formulate quadrupedal swimming as a constrained optimization problem that maximizes forward thrust while minimizing destabilizing fluctuations. Our proposed framework, Accelerated Constrained Proximal Policy Optimization with a PID-regulated Lagrange multiplier (ACPPO-PID), uses the PID-regulated multiplier to enforce constraints, accelerates learning via conditional asymmetric clipping, and stabilizes updates through cycle-wise geometric aggregation. Initialized with imitation learning and refined through on-hardware towing-tank experiments, ACPPO-PID produces control policies that transfer effectively to quadrupedal free-swimming trials. Results demonstrate improved thrust efficiency, reduced destabilizing forces, and faster convergence compared with state-of-the-art baselines, underscoring the importance of constraint-aware safe RL for robust and generalizable bio-inspired locomotion in complex fluid environments.
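
To make the idea of a PID-regulated Lagrange multiplier concrete, the following is a minimal sketch of the general technique, not the authors' implementation. The class name `PIDLagrangeMultiplier`, the helper `penalized_advantage`, the gain values, and the cost limit are all hypothetical; the abstract does not give the paper's controller details, and conditional asymmetric clipping and geometric aggregation are not shown here.

```python
from dataclasses import dataclass


@dataclass
class PIDLagrangeMultiplier:
    """Illustrative PID controller for a constraint multiplier (hypothetical gains)."""
    kp: float = 0.1          # proportional gain (assumed value)
    ki: float = 0.01         # integral gain (assumed value)
    kd: float = 0.05         # derivative gain (assumed value)
    cost_limit: float = 1.0  # allowed constraint cost d (assumed value)
    integral: float = 0.0    # accumulated violation, kept non-negative
    prev_cost: float = 0.0   # cost seen at the previous update

    def update(self, episode_cost: float) -> float:
        """Return the multiplier for the latest measured constraint cost."""
        error = episode_cost - self.cost_limit
        # Integral term: accumulate violation, but never below zero.
        self.integral = max(0.0, self.integral + error)
        # Derivative term: react only when the cost is rising.
        derivative = max(0.0, episode_cost - self.prev_cost)
        self.prev_cost = episode_cost
        lam = self.kp * error + self.ki * self.integral + self.kd * derivative
        # A Lagrange multiplier for an inequality constraint must be >= 0.
        return max(0.0, lam)


def penalized_advantage(reward_adv: float, cost_adv: float, lam: float) -> float:
    """Combine reward and cost advantages; dividing by (1 + lam) keeps the scale bounded."""
    return (reward_adv - lam * cost_adv) / (1.0 + lam)


if __name__ == "__main__":
    pid = PIDLagrangeMultiplier()
    for cost in [2.0, 1.5, 1.0, 0.5]:  # made-up per-iteration costs
        lam = pid.update(cost)
        print(f"cost={cost:.1f} lambda={lam:.3f}")
```

In a PPO-style loop, the multiplier would be refreshed once per policy iteration from the measured constraint cost, and `penalized_advantage` would replace the plain reward advantage in the clipped surrogate objective. The proportional and derivative terms are what let the multiplier respond quickly to violations and damp the oscillation that a pure integral (gradient-ascent) update tends to show.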

Problem

Research questions and friction points this paper is trying to address.

bio-inspired propulsion

quadrupedal swimming

destabilizing forces

constrained optimization

fluid-structure interaction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Constrained Reinforcement Learning

Bio-inspired Propulsion

PID-regulated Lagrange Multiplier