🤖 AI Summary
Existing robotic tool manipulation approaches struggle to achieve semantic-level planning and high-fidelity contact control simultaneously, and often exhibit limited generalization. This work proposes Semantic-Contact Fields (SCFields), a unified 3D representation that jointly encodes semantic and contact information. By integrating visual semantics with dense contact estimation through a two-stage sim-to-real contact learning pipeline, SCFields provide dense, contact-aware observations for diffusion-based policies. The method combines geometric heuristics, force optimization, and few-shot real-world data alignment to enable robust tool use. Evaluated on scraping, crayon-drawing, and peeling tasks, the method significantly outperforms vision-only and raw-tactile baselines, demonstrating category-level cross-tool generalization and robust manipulation of previously unseen tools.
📝 Abstract
Generalizing tool manipulation requires both semantic planning and precise physical control. Modern generalist robot policies, such as Vision-Language-Action (VLA) models, often lack the high-fidelity physical grounding required for contact-rich tool manipulation. Conversely, existing contact-aware policies that leverage tactile or haptic sensing are typically instance-specific and fail to generalize across diverse tool geometries. Bridging this gap requires learning unified contact representations from diverse data, yet a fundamental barrier remains: collecting diverse real-world tactile data at scale is prohibitively expensive, while direct zero-shot sim-to-real transfer is challenging due to the complex, nonlinear deformation dynamics of soft sensors. To address this, we propose Semantic-Contact Fields (SCFields), a unified 3D representation fusing visual semantics with dense contact estimates. We enable this via a two-stage Sim-to-Real Contact Learning Pipeline: first, we pre-train on a large-scale simulation dataset to learn general contact physics; second, we fine-tune on a small set of real data, pseudo-labeled via geometric heuristics and force optimization, to align sensor characteristics. This allows physical generalization to unseen tools. We leverage SCFields as the dense observation input for a diffusion policy to enable robust execution of contact-rich tool manipulation tasks. Experiments on scraping, crayon drawing, and peeling demonstrate robust category-level generalization, significantly outperforming vision-only and raw-tactile baselines.
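The two-stage pipeline in the abstract (pre-train a contact estimator on abundant simulated data, then fine-tune on a few pseudo-labeled real samples to align sensor characteristics) can be sketched in miniature as below. This is an illustrative toy, not the paper's implementation: the linear contact model, the `fit` helper, and the sim/real offset are all assumptions standing in for the actual learned contact fields and sensor gap.

```python
# Toy sketch of the two-stage sim-to-real contact learning pipeline.
# All names and the linear model are illustrative assumptions, not the
# paper's architecture.
import numpy as np

rng = np.random.default_rng(0)

def fit(W, X, Y, lr, steps):
    # Gradient-descent least-squares fit of a linear contact estimator.
    for _ in range(steps):
        grad = X.T @ (X @ W - Y) / len(X)
        W = W - lr * grad
    return W

# Stage 1: pre-train on abundant simulated tactile data to capture
# general contact physics.
X_sim = rng.normal(size=(1000, 8))     # simulated sensor features
W_true = rng.normal(size=(8, 3))       # latent feature-to-contact mapping
Y_sim = X_sim @ W_true                 # dense contact labels from simulation
W = fit(np.zeros((8, 3)), X_sim, Y_sim, lr=0.1, steps=200)
W_pretrained = W.copy()

# Stage 2: fine-tune on a few pseudo-labeled real samples to align sensor
# characteristics (modeled here as a small systematic offset).
X_real = rng.normal(size=(20, 8))
Y_real = X_real @ (W_true + 0.1)       # real sensor responds slightly differently
W = fit(W, X_real, Y_real, lr=0.05, steps=100)
```

The design point the sketch captures is that stage 1 does the heavy lifting from cheap simulated data, so stage 2 only needs a handful of real samples to close the remaining sim-to-real gap.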