Dynamic Reconstruction of Hand-Object Interaction with Distributed Force-aware Contact Representation

📅 2024-11-14
🏛️ arXiv.org
📈 Citations: 1
✨ Influential: 0
🤖 AI Summary
Existing hand-object interaction reconstruction methods rely solely on vision, limiting their ability to model occlusions and deformable object dynamics. This paper proposes ViTaM-D, the first framework integrating vision with distributed tactile sensing for dynamic, high-fidelity interaction reconstruction. Its key contributions are: (1) DF-Field, a novel implicit field that unifies distributed force perception by jointly encoding contact kinetic and potential energy; (2) HOT, a first-of-its-kind high-fidelity simulation benchmark specifically designed for deformable object interaction; and (3) a synergistic pipeline comprising VDT-Net for initial reconstruction and a Force-aware Optimization (FO) algorithm for refinement. Evaluated on DexYCB and HOT, ViTaM-D reduces hand pose error by 23.6% and improves object deformation reconstruction PSNR by 5.8 dB over state-of-the-art methods including HOTrack and gSDF.

๐Ÿ“ Abstract
We present ViTaM-D, a novel visual-tactile framework for dynamic hand-object interaction reconstruction, integrating distributed tactile sensing for more accurate contact modeling. While existing methods focus primarily on visual inputs, they struggle with capturing detailed contact interactions such as object deformation. Our approach leverages distributed tactile sensors to address this limitation by introducing DF-Field. This distributed force-aware contact representation models both kinetic and potential energy in hand-object interaction. ViTaM-D first reconstructs hand-object interactions using a visual-only network, VDT-Net, and then refines contact details through a force-aware optimization (FO) process, enhancing object deformation modeling. To benchmark our approach, we introduce the HOT dataset, which features 600 sequences of hand-object interactions, including deformable objects, built in a high-precision simulation environment. Extensive experiments on both the DexYCB and HOT datasets demonstrate significant improvements in accuracy over previous state-of-the-art methods such as gSDF and HOTrack. Our results highlight the superior performance of ViTaM-D in both rigid and deformable object reconstruction, as well as the effectiveness of DF-Field in refining hand poses. This work offers a comprehensive solution to dynamic hand-object interaction reconstruction by seamlessly integrating visual and tactile data. Codes, models, and datasets will be available.
Problem

Research questions and friction points this paper is trying to address.

Reconstructing dynamic hand-object interaction with tactile sensing
Addressing occluded interactions and object deformation in vision-only methods
Improving contact modeling and hand pose refinement accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Visual-tactile framework with distributed tactile sensing
Force-aware contact representation using energy dynamics
Dataset featuring 600 hand-object interaction sequences
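The force-aware contact representation above jointly encodes contact kinetic and potential energy. The paper does not publish its exact formulation here, but a minimal sketch of an energy term of this kind, assuming per-taxel force readings, per-taxel velocities, and a linear-spring contact model (all names and constants below are illustrative, not the authors' implementation), might look like:

```python
import math

def contact_energy(taxel_forces, taxel_velocities,
                   taxel_mass=1e-3, stiffness=500.0):
    """Illustrative energy term for a force-aware contact representation.

    taxel_forces: list of contact force magnitudes (N), one per taxel.
    taxel_velocities: list of (vx, vy, vz) taxel velocities (m/s).
    taxel_mass and stiffness are hypothetical constants, not values
    from the ViTaM-D paper.
    """
    # Kinetic energy: sum of (1/2) m |v|^2 over taxels in contact.
    kinetic = sum(0.5 * taxel_mass * (vx * vx + vy * vy + vz * vz)
                  for vx, vy, vz in taxel_velocities)
    # Potential energy under a linear-spring contact model:
    # f = k * d  =>  U = (1/2) k d^2 = f^2 / (2k).
    potential = sum(f * f / (2.0 * stiffness) for f in taxel_forces)
    return kinetic + potential
```

A refinement stage in the spirit of the FO algorithm would then adjust hand pose and object deformation parameters to keep such an energy consistent with the observed tactile readings; the actual objective and solver are specified in the paper.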