🤖 AI Summary
Manipulating deformable objects such as ropes and cloth has typically required many real-world demonstrations per object, and the resulting policies generalize poorly. To address this, the authors propose GenDOM, a parameter-aware one-shot imitation learning framework. The method conditions the policy network on deformable object parameters and trains it on a diverse range of simulated objects. For a new object, it estimates those parameters from a single real-world demonstration by aligning the grid density of the demonstration's point cloud with that of a differentiable physics simulation. Because parameter estimation and the parameter-conditioned policy work together, a single demonstration suffices to adapt to a new object. Reported improvements over the baseline: in simulation, 62% for in-domain ropes and 15% for out-of-distribution ropes; in the real world, 26% for ropes and 50% for cloth.
📝 Abstract
Because the deformation of objects such as rope and cloth is inherently uncertain during motion, previous methods for deformable object manipulation often required hundreds of real-world demonstrations to train a manipulation policy for each object, which hinders their application in our ever-changing world. To address this issue, we introduce GenDOM, a framework that allows the manipulation policy to handle different deformable objects with only a single real-world demonstration. To achieve this, we condition the policy on deformable object parameters and train it on a diverse range of simulated deformable objects, so that the policy can adjust its actions to different object parameters. At inference time, given a new object, GenDOM estimates the deformable object parameters from a single real-world demonstration by minimizing the disparity between the grid density of the point clouds of the real-world demonstration and that of simulations in a differentiable physics simulator. Empirical validation on both simulated and real-world object manipulation setups shows that our method can manipulate different objects with a single demonstration and significantly outperforms the baseline in both environments (a 62% improvement for in-domain ropes and a 15% improvement for out-of-distribution ropes in simulation, as well as a 26% improvement for ropes and a 50% improvement for cloths in the real world), demonstrating the effectiveness of our approach in one-shot deformable object manipulation. https://sites.google.com/view/gendom/home
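To make the parameter-estimation idea concrete, the following is a minimal sketch of comparing two point clouds by grid density and picking the object parameter whose simulated point cloud best matches an observed one. It is not GenDOM's implementation: `toy_rope` is a hypothetical stand-in for a physics simulator, the single `stiffness` parameter and the 2-D grid are simplifying assumptions, and an exhaustive search over candidate values replaces the gradient-based optimization that a differentiable simulator would allow.

```python
def grid_density(points, grid_size=8, lo=0.0, hi=1.0):
    """Fraction of 2-D points falling in each cell of a grid_size x grid_size grid."""
    counts = [[0.0] * grid_size for _ in range(grid_size)]
    cell = (hi - lo) / grid_size
    for x, y in points:
        # Clamp indices so points on or outside the boundary land in an edge cell.
        i = min(max(int((x - lo) / cell), 0), grid_size - 1)
        j = min(max(int((y - lo) / cell), 0), grid_size - 1)
        counts[i][j] += 1.0
    n = float(len(points))
    return [[c / n for c in row] for row in counts]


def density_disparity(points_a, points_b, grid_size=8):
    """Sum of squared differences between the two grid densities."""
    da = grid_density(points_a, grid_size)
    db = grid_density(points_b, grid_size)
    return sum((a - b) ** 2 for ra, rb in zip(da, db) for a, b in zip(ra, rb))


def toy_rope(stiffness, n_points=50):
    """Hypothetical simulator stand-in: a rope pinned at both ends
    whose sag shrinks as stiffness grows (assumption for illustration only)."""
    sag = 0.4 / (1.0 + stiffness)
    points = []
    for k in range(n_points):
        t = k / (n_points - 1)
        x = 0.1 + 0.8 * t
        y = 0.9 - sag * 4.0 * t * (1.0 - t)  # parabolic sag profile
        points.append((x, y))
    return points


def estimate_stiffness(observed_points, candidates):
    """Return the candidate whose simulated point cloud has the
    lowest grid-density disparity to the observed point cloud."""
    return min(
        candidates,
        key=lambda s: density_disparity(toy_rope(s), observed_points),
    )


if __name__ == "__main__":
    observed = toy_rope(1.0)  # pretend this came from a real demonstration
    print(estimate_stiffness(observed, [0.0, 1.0, 3.0, 7.0]))
```

In this toy setting the observed point cloud is generated by the same stand-in simulator, so the matching candidate has zero disparity. With real sensor data the minimum would be nonzero, and a smooth (for example, bilinear) density would be needed for gradients to flow back to the parameters.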