🤖 AI Summary
This work addresses the Real2Sim2Real problem of manipulating deformable linear objects (DLOs) from visual perception. The proposed end-to-end framework has three stages. First, likelihood-free inference (LFI) estimates Bayesian posterior distributions over the physical parameters of each DLO, using visual and proprioceptive data from a real-world dynamic manipulation trajectory. Second, these posteriors drive domain randomization in simulation while object-specific visuomotor policies for a DLO reaching task are trained with model-free reinforcement learning. Third, the sim-trained policies are deployed in the real world zero-shot, i.e. without fine-tuning. The work also evaluates how finely a prominent LFI method can classify DLOs within the parametric set, and studies how the resulting domain distributions affect policy learning in simulation and performance in the real world.
📝 Abstract
We present an integrated (end-to-end) framework for the Real2Sim2Real problem of manipulating deformable linear objects (DLOs) based on visual perception. Working with a parameterised set of DLOs, we use likelihood-free inference (LFI) to compute posterior distributions over the physical parameters, with which we can approximately simulate the behaviour of each specific DLO. We use these posteriors for domain randomisation while training, in simulation, object-specific visuomotor policies for a DLO reaching task using model-free reinforcement learning. We demonstrate the utility of this approach by deploying sim-trained DLO manipulation policies in the real world in a zero-shot manner, i.e. without any further fine-tuning. In this context, we evaluate the capacity of a prominent LFI method to perform fine classification over the parametric set of DLOs, using only visual and proprioceptive data obtained from a dynamic manipulation trajectory. We then study the implications of the resulting domain distributions for sim-based policy learning and real-world performance.
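The link between posterior inference and domain randomisation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: rejection ABC stands in for the unnamed LFI method, a one-dimensional damped spring stands in for a DLO simulator, and the names `simulate_tip`, `rejection_abc` and `sample_domain` are hypothetical. It shows only the general pattern of inferring a parameter posterior from an observed trajectory and then drawing simulation parameters from it.

```python
import math
import random


def simulate_tip(stiffness, damping, steps=50, dt=0.02):
    """Toy stand-in for a DLO simulator: a damped spring released from x = 1."""
    x, v = 1.0, 0.0
    trajectory = []
    for _ in range(steps):
        a = -stiffness * x - damping * v
        v += a * dt  # semi-implicit Euler step
        x += v * dt
        trajectory.append(x)
    return trajectory


def distance(a, b):
    """Root-mean-square difference between two equal-length trajectories."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))


def rejection_abc(observed, n_draws=5000, keep=100, seed=0):
    """Rejection ABC: draw parameters from a uniform prior, simulate, and keep
    the `keep` draws whose trajectories lie closest to the observation."""
    rng = random.Random(seed)
    scored = []
    for _ in range(n_draws):
        theta = (rng.uniform(1.0, 50.0), rng.uniform(0.0, 5.0))
        scored.append((distance(simulate_tip(*theta), observed), theta))
    scored.sort(key=lambda item: item[0])
    return [theta for _, theta in scored[:keep]]


def sample_domain(posterior, rng):
    """Domain randomisation: draw one parameter set per training episode."""
    return rng.choice(posterior)


# The "real" observation is faked here with known parameters (assumption).
observed = simulate_tip(stiffness=20.0, damping=1.0)
posterior = rejection_abc(observed)

# Each simulated training episode would use parameters drawn from the posterior.
episode_rng = random.Random(1)
episode_params = [sample_domain(posterior, episode_rng) for _ in range(3)]
mean_stiffness = sum(k for k, _ in posterior) / len(posterior)
```

In this sketch the accepted samples act as an empirical posterior, so randomising over them concentrates training on parameter values consistent with the observed behaviour rather than on the full prior range.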