🤖 AI Summary
This work addresses insufficient force-control precision when manipulating fragile or deformable objects. We propose FARM (Force-Aware Robotic Manipulation), a diffusion-policy imitation learning framework that leverages high-dimensional tactile feedback. FARM fuses GelSight Mini tactile sensing with visual inputs to establish a force-grounded imitation learning paradigm, mapping tactile signals to a force-based action space. Unlike conventional end-to-end learning or model-predictive control approaches, FARM employs a diffusion model to explicitly capture the joint distribution of force and tactile observations, enabling fine-grained and robust grasp force generation. Evaluated on three tasks with distinct force requirements (high-force, low-force, and dynamic force adaptation), FARM consistently outperforms behavioral cloning, BC-Z, and TD3-based tactile baselines. The results indicate that FARM achieves precise, adaptive force control under complex, unstructured contact conditions and generalizes across the evaluated manipulation scenarios.
📝 Abstract
Contact-rich manipulation depends on applying the correct grasp forces throughout the task, especially when handling fragile or deformable objects. Most existing imitation learning approaches treat visuotactile feedback only as an additional observation, leaving applied forces as an uncontrolled consequence of gripper commands. In this work, we present Force-Aware Robotic Manipulation (FARM), an imitation learning framework that integrates high-dimensional tactile data to infer tactile-conditioned force signals, which in turn define a matching force-based action space. We collect human demonstrations using a modified version of the handheld Universal Manipulation Interface (UMI) gripper that integrates a GelSight Mini visuotactile sensor. To deploy the learned policies, we develop an actuated variant of the UMI gripper whose geometry matches our handheld version. During policy rollouts, the proposed FARM diffusion policy jointly predicts robot pose, grip width, and grip force. FARM outperforms several baselines across three tasks with distinct force requirements (high-force, low-force, and dynamic force adaptation), demonstrating the advantages of its two key components: force-grounded, high-dimensional tactile observations and a force-based control space. The codebase and design files are open-sourced and available at https://tactile-farm.github.io.
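
To make the idea of a force-based action space concrete, the following is a minimal sketch of what a joint action of robot pose, grip width, and grip force might look like, together with a simple proportional rule that nudges the grip width toward a target force. It is not taken from the FARM codebase: the class name `ForceAwareAction`, the 7-value pose layout (position plus quaternion), the units, the helper `regulate_width`, and all gains and limits are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class ForceAwareAction:
    """Hypothetical joint action: pose, grip width, and grip force."""
    pose: Tuple[float, ...]  # assumed layout: x, y, z, qx, qy, qz, qw
    grip_width: float        # assumed unit: metres
    grip_force: float        # assumed unit: newtons

    def to_vector(self) -> List[float]:
        # Flatten into the 9-value vector a policy head could predict.
        return [*self.pose, self.grip_width, self.grip_force]

    @classmethod
    def from_vector(cls, vec: Sequence[float]) -> "ForceAwareAction":
        if len(vec) != 9:
            raise ValueError("expected 7 pose values + width + force")
        return cls(tuple(vec[:7]), vec[7], vec[8])


def regulate_width(width: float, measured_force: float, target_force: float,
                   gain: float = 1e-4, min_width: float = 0.0,
                   max_width: float = 0.08) -> float:
    """Illustrative proportional rule (an assumption, not the paper's controller).

    Closes the gripper when the measured force is below the target and
    opens it when the force is above, clamped to the width limits.
    """
    error = target_force - measured_force
    new_width = width - gain * error
    return min(max(new_width, min_width), max_width)
```

In this sketch the predicted grip force acts as a setpoint rather than a by-product of the width command, which is the distinction the abstract draws between a force-based control space and width-only gripper commands.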