🤖 AI Summary
Existing robotic manipulation methods for soft/deformable objects exhibit poor generalization, struggling to adapt to unseen external forces or novel objects. This paper introduces the Template-Enhanced Shape-space Deformer (TESD), the first framework unifying visual and tactile modalities into a single deformable representation space that generalizes across objects and applied forces. TESD integrates multimodal perception encoding, differentiable template-based deformation modeling, and geometry-aware regularization robust to sensor anomalies. The approach enables rapid adaptation and high-fidelity deformation reconstruction under previously unobserved forces and on unseen objects. Experiments demonstrate a 37% reduction in deformation reconstruction error under multi-force generalization scenarios, while maintaining real-time inference capability. By providing a high-fidelity, cross-object and cross-force deformable prior, TESD advances dexterous manipulation of soft materials.
📝 Abstract
Accurate modelling of object deformations is crucial for a wide range of robotic manipulation tasks, where interacting with soft or deformable objects is essential. Current methods struggle to generalise to unseen forces or adapt to new objects, limiting their utility in real-world applications. We propose Shape-Space Deformer, a unified representation for encoding a diverse range of object deformations using template augmentation to achieve robust, fine-grained reconstructions that are resilient to outliers and unwanted artefacts. Our method improves generalization to unseen forces and can rapidly adapt to novel objects, significantly outperforming existing approaches. We perform extensive experiments to test a range of force generalisation settings and evaluate our method's ability to reconstruct unseen deformations, demonstrating significant improvements in reconstruction accuracy and robustness. Our approach is suitable for real-time performance, making it ready for downstream manipulation applications.