🤖 AI Summary
This work addresses the low data efficiency of imitation learning for robotic manipulation, studied on a clothes hanger insertion task. The authors integrate sensors directly into the object (instrumentation) to obtain additional state information, and train diffusion policies on 180 teleoperated demonstrations with and without access to the instrumentation data. They show that a black-box imitation learning policy learns to prioritize instrumentation signals without explicit guidance. Furthermore, by augmenting the teleoperation dataset with rollouts from the instrumented expert policy, they train a vision-only student policy that surpasses the original vision-only policy. Experimental results show that instrumented policies outperform their vision-only counterparts by 14–25 percentage points in task success, while the vision-only student policy achieves performance comparable to the instrumented expert.
📝 Abstract
Large behaviour models have transformed the field of robotic manipulation, but prohibitive data requirements have thus far prevented a revolution similar to that of vision-language models. We believe that instrumentation, i.e. sensor integration in objects, can provide invaluable state information and enable efficient learning for robotic manipulation. In this paper, we present instrumented imitation learning of clothes hanger insertion. Using 180 teleoperated demonstrations, we train diffusion policies with and without access to instrumentation data. Results show that policies leveraging instrumentation outperform vision-only counterparts by 14–25 percentage points and exhibit greater task awareness. Crucially, a black-box imitation learning policy learns to prioritise instrumentation signals without explicit guidance. In addition, enhancing the teleoperation dataset with rollouts from an instrumented expert policy enables a vision-only student policy to achieve performance comparable to the instrumented expert, thereby surpassing the original vision-only policy. These findings establish instrumentation as a promising strategy to enhance imitation learning for robotic manipulation. Datasets are available on Zenodo.
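The dataset enhancement step described in the abstract can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the authors' pipeline: the `Step` and `Episode` containers, the `strip_instrumentation` and `build_student_dataset` helpers, and the choice to keep only successful expert rollouts are all hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    image: list                       # placeholder for a camera observation
    instrumentation: Optional[list]   # object sensor reading; None once stripped
    action: list                      # commanded robot action


@dataclass
class Episode:
    steps: List[Step]
    success: bool
    source: str                       # "teleop" or "expert_rollout" (hypothetical labels)


def strip_instrumentation(ep: Episode) -> Episode:
    """Return a copy of the episode with the instrumentation channel removed."""
    return Episode(
        steps=[Step(s.image, None, s.action) for s in ep.steps],
        success=ep.success,
        source=ep.source,
    )


def build_student_dataset(
    teleop: List[Episode],
    expert_rollouts: List[Episode],
    keep_only_successes: bool = True,  # assumption: failed rollouts are discarded
) -> List[Episode]:
    """Combine teleoperated demos with expert rollouts for a vision-only student."""
    selected = [
        ep for ep in expert_rollouts if ep.success or not keep_only_successes
    ]
    return [strip_instrumentation(ep) for ep in list(teleop) + selected]


# Tiny synthetic example: 2 teleop demos, 3 expert rollouts (one failed).
_step = Step(image=[0.0], instrumentation=[1.0], action=[0.1])
teleop_demos = [Episode([_step], True, "teleop") for _ in range(2)]
rollouts = [
    Episode([_step], True, "expert_rollout"),
    Episode([_step], False, "expert_rollout"),
    Episode([_step], True, "expert_rollout"),
]
student_data = build_student_dataset(teleop_demos, rollouts)
```

The point of the sketch is only the data flow: the expert sees the instrumentation channel when it generates rollouts, while the student is trained on the merged dataset with that channel removed, so it has to rely on vision alone.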