Swimming Under Constraints: A Safe Reinforcement Learning Framework for Quadrupedal Bio-Inspired Propulsion

📅 2026-03-04
🤖 AI Summary
This work addresses the lift oscillations and instability that six-degree-of-freedom fluid coupling causes in bio-inspired quadrupedal underwater propulsion. To balance propulsive efficiency against motion stability, the authors formulate quadrupedal swimming as a constrained optimization problem and solve it with a safety-aware reinforcement learning approach. They develop Accelerated Constrained Proximal Policy Optimization with PID control (ACPPO-PID), in which a PID-regulated Lagrange multiplier enforces the safety constraints. Conditional asymmetric clipping speeds up learning, and cycle-wise geometric aggregation stabilizes the policy updates. Policies are initialized with imitation learning and refined through on-hardware towing-tank experiments before being transferred to free-swimming trials. The reported results show improved thrust efficiency, reduced destabilizing forces, and faster convergence than the baselines, along with robust, generalizable behavior in free-swimming tasks on a real quadrupedal robot.

📝 Abstract
Bio-inspired aquatic propulsion offers high thrust and maneuverability but is prone to destabilizing forces such as lift fluctuations, which are further amplified by six-degree-of-freedom (6-DoF) fluid coupling. We formulate quadrupedal swimming as a constrained optimization problem that maximizes forward thrust while minimizing destabilizing fluctuations. Our proposed framework, Accelerated Constrained Proximal Policy Optimization with a PID-regulated Lagrange multiplier (ACPPO-PID), enforces constraints through that multiplier, accelerates learning via conditional asymmetric clipping, and stabilizes updates through cycle-wise geometric aggregation. Initialized with imitation learning and refined through on-hardware towing-tank experiments, ACPPO-PID produces control policies that transfer effectively to quadrupedal free-swimming trials. Results demonstrate improved thrust efficiency, reduced destabilizing forces, and faster convergence compared with state-of-the-art baselines, underscoring the importance of constraint-aware safe RL for robust and generalizable bio-inspired locomotion in complex fluid environments.
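
To make the idea of a PID-regulated Lagrange multiplier concrete, here is a minimal sketch of one common way such an update can be written. It is not the authors' implementation: the class name `PIDLagrangian`, the helper `penalized_advantage`, the gains, and the cost limit are all illustrative assumptions, and the abstract does not specify how ACPPO-PID computes its multiplier. The "cost" stands in for a measure of destabilizing force (for example, lift fluctuation) that should stay below a limit.

```python
from dataclasses import dataclass


@dataclass
class PIDLagrangian:
    """Hypothetical PID-regulated multiplier; gains and limit are assumed values."""
    kp: float = 0.1          # proportional gain (assumption)
    ki: float = 0.01         # integral gain (assumption)
    kd: float = 0.05         # derivative gain (assumption)
    cost_limit: float = 1.0  # allowed constraint cost (assumption)
    integral: float = 0.0
    prev_cost: float = 0.0

    def update(self, episode_cost: float) -> float:
        """Return the new multiplier given the latest measured constraint cost."""
        error = episode_cost - self.cost_limit
        # The integral term accumulates violation and is clamped at zero.
        self.integral = max(0.0, self.integral + error)
        # The derivative term reacts only when the cost is rising.
        derivative = max(0.0, episode_cost - self.prev_cost)
        self.prev_cost = episode_cost
        # The multiplier itself must stay non-negative.
        return max(0.0, self.kp * error + self.ki * self.integral + self.kd * derivative)


def penalized_advantage(reward_adv: float, cost_adv: float, lam: float) -> float:
    """Combine reward and cost advantages; dividing by (1 + lam) keeps the scale bounded."""
    return (reward_adv - lam * cost_adv) / (1.0 + lam)
```

In a constrained PPO loop, the multiplier would be refreshed once per policy update from the measured cost, and the combined advantage would replace the plain reward advantage in the clipped surrogate objective. The proportional and derivative terms are what let the multiplier respond quickly to rising violations instead of oscillating, which is the usual motivation for PID regulation over a purely integral update.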
Problem

Research questions and friction points this paper is trying to address.

bio-inspired propulsion
quadrupedal swimming
destabilizing forces
constrained optimization
fluid-structure interaction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Constrained Reinforcement Learning
Bio-inspired Propulsion
PID-regulated Lagrange Multiplier
Quadrupedal Swimming
Safe RL
Xinyu Cui
FSI lab, Westlake University, Hangzhou 310030, China; Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
Fei Han
National University of Singapore
Differential Geometry, Topology and Mathematical Physics
Hang Xu
FSI lab, Westlake University, Hangzhou, 310030, China.
Yongcheng Zeng
University of Chinese Academy of Sciences
LLM, Reinforcement Learning
Luoyang Sun
Institute of Automation, Chinese Academy of Sciences
Machine Learning
Ruizhi Zhang
Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China.
Jian Zhao
Zhongguancun Institute of Artificial Intelligence
Reinforcement Learning, Multi-Agent System
Haifeng Zhang
Institute of Automation, Chinese Academy of Sciences
Reinforcement Learning, Computational Advertising
Weikun Li
FSI lab, Westlake University, Hangzhou, 310030, China.
Hao Chen
Postdoc at Westlake University
Engineering Optimization, Surrogate Model, Data-driven Optimization, Robotic Fish, Complex Network
Jun Wang
Professor, Computer Science, University College London
Machine Learning, Multi-agent Learning, Information Retrieval, Recommender Systems, Computational Advertising
Dixia Fan
FSI lab, Westlake University, Hangzhou, 310030, China.