Robot Deformable Object Manipulation via NMPC-generated Demonstrations in Deep Reinforcement Learning

📅 2025-02-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the low sample efficiency of reinforcement learning (RL) and the high cost of human demonstrations in deformable object manipulation (e.g., cloth), this paper proposes HGCR-DDPG, a hybrid framework that integrates low-cost simulation demonstrations generated by nonlinear model predictive control (NMPC) with deep RL. Key contributions include: (1) an NMPC-driven, low-overhead demonstration acquisition method; (2) high-dimensional fuzzy-logic-guided grasping-point selection; and (3) a refined behavior-cloning and sequential policy-learning framework. In physics-based simulation, the method achieves a global average reward 2.01× that of the baseline, with the global average standard deviation reduced to 45% of the baseline's. On real robotic hardware, it attains success rates of 83.3%, 80%, and 100% across three cloth manipulation tasks (diagonal folding, central-axis folding, and flattening). The algorithm is lightweight, computationally efficient, and supports customizable, task-specific deployment.

📝 Abstract
In this work, we studied robot manipulation of deformable objects based on demonstration-enhanced reinforcement learning (RL). To improve the learning efficiency of RL, we enhanced the utilization of demonstration data from multiple aspects and proposed the HGCR-DDPG algorithm. It uses a novel high-dimensional fuzzy approach for grasping-point selection, a refined behavior-cloning method to strengthen data-driven learning in Rainbow-DDPG, and a sequential policy-learning strategy. Compared to the baseline algorithm (Rainbow-DDPG), HGCR-DDPG achieved 2.01 times the global average reward and reduced the global average standard deviation to 45% of the baseline's. To reduce the human labor cost of demonstration collection, we proposed a low-cost demonstration collection method based on Nonlinear Model Predictive Control (NMPC). Simulation results show that demonstrations collected through NMPC can be used to train HGCR-DDPG, achieving results comparable to those obtained with human demonstrations. To validate the feasibility of the proposed methods in real-world environments, we conducted physical experiments on deformable object manipulation, using fabric to perform three tasks: diagonal folding, central-axis folding, and flattening. The proposed method achieved success rates of 83.3%, 80%, and 100% on these tasks, respectively, validating its effectiveness. Compared to current large-model approaches to robot manipulation, the proposed algorithm is lightweight, requires fewer computational resources, and offers efficient, task-specific customization.
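The NMPC demonstration pipeline described above can be sketched in miniature. The sketch below is not the authors' implementation: it substitutes a hypothetical sampling-based MPC for the paper's nonlinear MPC solver, and a toy point-dragging model for the cloth simulator; all function names, dynamics, and cost terms here are our assumptions.

```python
import numpy as np

def toy_dynamics(state, action):
    # Hypothetical stand-in for the cloth simulator: a grasped point
    # being dragged toward a fold target. state and action are 2-D.
    return state + 0.5 * action

def mpc_action(state, target, horizon=5, n_samples=128, rng=None):
    # Sampling-based MPC stand-in for the paper's NMPC solver: sample
    # candidate action sequences, roll each out through the model, and
    # return the first action of the lowest-cost sequence.
    rng = rng if rng is not None else np.random.default_rng(0)
    seqs = rng.uniform(-1.0, 1.0, size=(n_samples, horizon, 2))
    costs = np.zeros(n_samples)
    for i, seq in enumerate(seqs):
        s = state
        for a in seq:
            s = toy_dynamics(s, a)
            costs[i] += np.sum((s - target) ** 2)
    return seqs[np.argmin(costs)][0]

def collect_demonstrations(n_episodes=3, steps=10):
    # Roll out the MPC controller in simulation and log
    # (state, action, next_state) transitions as demonstrations,
    # replacing costly human teleoperation.
    rng = np.random.default_rng(42)
    target = np.array([1.0, 1.0])
    demos = []
    for _ in range(n_episodes):
        s = rng.uniform(-1.0, 1.0, size=2)
        for _ in range(steps):
            a = mpc_action(s, target, rng=rng)
            s_next = toy_dynamics(s, a)
            demos.append((s.copy(), a, s_next.copy()))
            s = s_next
    return demos
```

In the paper's setting, transitions collected this way would seed the RL agent's demonstration buffer in place of human-provided trajectories.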
Problem

Research questions and friction points this paper is trying to address.

Improve the sample efficiency of RL for robot deformable object manipulation.
Reduce the human labor cost of collecting demonstration data.
Validate the method on real-world deformable object manipulation tasks.
Innovation

Methods, ideas, or system contributions that make the work stand out.

High-dimensional fuzzy grasping-point selection
Refined behavior cloning in Rainbow-DDPG
Sequential policy-learning strategy
Low-cost NMPC-based demonstration collection
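The refined behavior-cloning contribution can be illustrated as a DDPG actor objective with an auxiliary imitation term. The Q-filter gating below (applying the behavior-cloning penalty only where the critic rates the demonstrator's action at least as highly as the policy's) is our assumption about what "refined" means, borrowed from common demonstration-augmented DDPG practice, not the paper's exact formulation.

```python
import numpy as np

def actor_loss(policy_actions, demo_actions, q_values, q_demo, lambda_bc=1.0):
    """Demonstration-augmented DDPG actor loss (illustrative sketch).

    policy_actions: actions pi(s) from the current policy, shape (N, act_dim)
    demo_actions:   demonstrator's actions for the same states, shape (N, act_dim)
    q_values:       critic values Q(s, pi(s)), shape (N,)
    q_demo:         critic values Q(s, a_demo), shape (N,)
    """
    # Standard DDPG actor objective: maximize Q(s, pi(s)).
    rl_loss = -np.mean(q_values)
    # Q-filter: imitate the demonstrator only where the critic
    # considers its action no worse than the policy's own.
    mask = (q_demo >= q_values).astype(float)
    bc_loss = np.mean(mask * np.sum((policy_actions - demo_actions) ** 2, axis=1))
    return rl_loss + lambda_bc * bc_loss
```

The filter lets imperfect demonstrations guide early learning without capping the policy at demonstrator-level performance once the critic learns better actions.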
Hongliang Lei, Zejia Zhang, Weizhuang Shi (Hubei Key Laboratory of Brain-inspired Intelligent Systems and the Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology (HUST), Wuhan 430074, Hubei, China)
Wei Luo (Department of Innovation Center, China Ship Development and Design Center, Wuhan 430064, Hubei, China)
Weiwei Wan (Graduate School of Engineering Science, Osaka University, Toyonaka 560-0043, Japan)
Jian Huang (Hubei Key Laboratory of Brain-inspired Intelligent Systems and the Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology (HUST), Wuhan 430074, Hubei, China)