Robotic Compliant Object Prying Using Diffusion Policy Guided by Vision and Force Observations

📅 2025-03-06

📈 Citations: 0

✨ Influential: 0

career value

237K/year

🤖 AI Summary

Precise visuo-tactile coordination is critical for disassembling flexible components—such as lithium-ion batteries—in recycling scenarios, yet existing methods suffer from poor multimodal fusion and limited generalization in contact-rich, unstructured environments. Method: We propose a diffusion-based policy framework featuring cross-dimensional force–vision alignment. It is the first to effectively incorporate six-axis force feedback into diffusion policies, leveraging a multimodal feature alignment encoder for real-time tactile–visual observation fusion and enabling end-to-end skill learning. Contribution/Results: Evaluated on real-world battery prying tasks, our method achieves a 96% success rate—outperforming vision-only baselines by 57%. Crucially, it supports zero-shot transfer to unseen battery types and objects without retraining. By addressing key bottlenecks in multimodal integration and generalization under dense physical interaction, the framework delivers a scalable, embodied intelligence solution for compliant manipulation of deformable objects.

Technology Category

Application Category

📝 Abstract

The growing adoption of batteries in the electric vehicle industry and various consumer products has created an urgent need for effective recycling solutions. These products often contain a mix of compliant and rigid components, making robotic disassembly a critical step toward achieving scalable recycling processes. Diffusion policy has emerged as a promising approach for learning low-level skills in robotics. To effectively apply diffusion policy to contact-rich tasks, incorporating force as feedback is essential. In this paper, we apply diffusion policy with vision and force in a compliant object prying task. However, when combining low-dimensional contact force with high-dimensional image, the force information may be diluted. To address this issue, we propose a method that effectively integrates force with image data for diffusion policy observations. We validate our approach on a battery prying task that demands high precision and multi-step execution. Our model achieves a 96% success rate in diverse scenarios, marking a 57% improvement over the vision-only baseline. Our method also demonstrates zero-shot transfer capability to handle unseen objects and battery types. Supplementary videos and implementation codes are available on our project website. https://rros-lab.github.io/diffusion-with-force.github.io/

Problem

Research questions and friction points this paper is trying to address.

Develops robotic disassembly for battery recycling using diffusion policy.

Integrates vision and force feedback to enhance precision in prying tasks.

Achieves high success rate and zero-shot transfer to unseen objects.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates vision and force for robotic prying

Uses diffusion policy for multi-step precision tasks

Achieves high success rate with zero-shot transfer

🔎 Similar Papers

No similar papers found.