🤖 AI Summary
Soft robotic systems face significant challenges in closed-loop control for dynamic tasks, including strongly nonlinear responses, underutilization of the configuration space, and the poor sample efficiency and initialization sensitivity of data-driven methods. This paper proposes a real-time reinforcement learning (RL) control framework for dynamic balancing tasks, implemented on a deformable Stewart platform actuated by motorized handed shearing auxetic (HSA) soft actuators. The approach employs the model-free Maximum Diffusion RL algorithm for end-to-end control. A curriculum learning strategy that samples balance targets from an expanding neighborhood centered on the known equilibrium point (sketched below) enables reliable dynamic balancing at arbitrary coordinates within a single hardware deployment. The method learns robustly even after 50% of the actuators fail, training on hardware with no prior data in as little as 15 minutes, with performance approaching that of the intact platform and substantial gains in robustness, adaptability, and sample efficiency.
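To make the curriculum idea concrete, here is a minimal Python sketch of an expanding-neighborhood goal sampler. All names, parameters, and the success-rate expansion rule are hypothetical illustrations, not the authors' implementation; it assumes 2D balance targets and a disk-shaped neighborhood.

```python
import numpy as np

class ExpandingNeighborhoodCurriculum:
    """Sample balance targets from a disk around a known equilibrium,
    growing the disk radius as the policy succeeds (hypothetical sketch)."""

    def __init__(self, equilibrium, r_init=0.01, r_max=0.10, growth=1.5,
                 success_threshold=0.8, window=20, rng=None):
        self.equilibrium = np.asarray(equilibrium, dtype=float)  # known stable point
        self.radius = r_init              # current neighborhood radius (m)
        self.r_max = r_max                # cap on how far targets may wander
        self.growth = growth              # multiplicative expansion factor
        self.success_threshold = success_threshold
        self.window = window              # attempts evaluated per expansion check
        self.outcomes = []                # rolling record of recent successes
        self.rng = rng or np.random.default_rng()

    def sample_target(self):
        """Draw a target uniformly from the current disk-shaped neighborhood."""
        angle = self.rng.uniform(0.0, 2.0 * np.pi)
        r = self.radius * np.sqrt(self.rng.uniform())  # sqrt -> uniform over disk area
        return self.equilibrium + r * np.array([np.cos(angle), np.sin(angle)])

    def report(self, success):
        """Record an attempt; expand the neighborhood once the recent
        success rate clears the threshold."""
        self.outcomes.append(bool(success))
        if len(self.outcomes) >= self.window:
            rate = np.mean(self.outcomes[-self.window:])
            if rate >= self.success_threshold and self.radius < self.r_max:
                self.radius = min(self.radius * self.growth, self.r_max)
                self.outcomes.clear()  # re-evaluate at the new difficulty
```

Sampling uniformly over the disk area (via the square-root trick) avoids clustering targets near the equilibrium; the paper's actual expansion schedule and success criterion are not specified here and may differ.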
📝 Abstract
Closed-loop control remains an open challenge in soft robotics. The nonlinear responses of soft actuators under dynamic loading conditions limit the use of analytic models for soft robot control. Traditional methods of controlling soft robots underutilize their configuration spaces to avoid nonlinearity, hysteresis, large deformations, and the risk of actuator damage. Furthermore, episodic data-driven control approaches such as reinforcement learning (RL) are traditionally limited by sample efficiency and inconsistency across initializations. In this work, we demonstrate RL that reliably learns control policies for dynamic balancing tasks in real-time, single-shot hardware deployments. We use a deformable Stewart platform constructed from parallel, 3D-printed soft actuators based on motorized handed shearing auxetic (HSA) structures. By introducing a curriculum learning approach based on expanding neighborhoods of a known equilibrium, we achieve reliable single-deployment balancing at arbitrary coordinates. In addition to benchmarking the performance of model-based and model-free methods, we demonstrate that, in a single deployment, Maximum Diffusion RL can learn dynamic balancing after half of the actuators are effectively disabled, by inducing buckling and by breaking actuators with bolt cutters. Training occurs with no prior data, in as little as 15 minutes, and yields performance nearly identical to that of the fully intact platform. Single-shot learning on hardware lets soft robotic systems learn reliably in the real world and will enable more diverse and capable soft robots.
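For readers unfamiliar with Maximum Diffusion (MaxDiff) RL, the following is a schematic form of its objective in our own notation, not reproduced from this paper: where maximum-entropy RL augments the return with the entropy of the action distribution, $\mathcal{H}[\pi(\cdot \mid s_t)]$, MaxDiff RL instead rewards entropy over state trajectories, encouraging the agent to diffuse through its state space and supporting single-shot learning.

$$
\pi^{*} \;=\; \arg\max_{\pi}\; \underbrace{\mathbb{E}_{p_{\pi}}\!\Big[\textstyle\sum_{t} r(s_t, a_t)\Big]}_{\text{task reward}} \;+\; \alpha\,\underbrace{\mathcal{H}\big[p_{\pi}(s_{0:T})\big]}_{\text{state-path entropy}}
$$

Here $p_{\pi}(s_{0:T})$ denotes the distribution over state paths induced by policy $\pi$, and $\alpha$ is a temperature weighting exploration against reward; the precise estimator used in practice is given in the MaxDiff RL literature rather than here.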