Exp-Force: Experience-Conditioned Pre-Grasp Force Selection with Vision-Language Models

📅 2026-03-09
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study addresses the challenge of setting a safe grasping force for compliant grippers before contact: too much force damages the object, while too little causes grasp failure. The authors propose a model-free, few-shot approach that retrieves a small set of historical grasping experiences as in-context examples and conditions a vision-language model on them to predict the minimum feasible grasping force directly from a single RGB image, without explicit contact modeling or handcrafted heuristics. The work is, to the authors' knowledge, the first to combine experience-conditioned learning with vision-language models for pre-contact force prediction. Experiments report a mean absolute error of 0.43 N across 129 objects (a 72% reduction over zero-shot inference) and an improvement in the rate of appropriate force selection on 30 unseen objects from 63% to 87%.

📝 Abstract
Accurate pre-contact grasp force selection is critical for safe and reliable robotic manipulation. Adaptive controllers regulate force after contact but still require a reasonable initial estimate. Starting a grasp with too little force requires reactive adjustment, while starting with too much force risks damaging fragile objects. This trade-off is particularly challenging for compliant grippers, whose contact mechanics are difficult to model analytically. We propose Exp-Force, an experience-conditioned framework that predicts the minimum feasible grasping force from a single RGB image. The method retrieves a small set of relevant prior grasping experiences and conditions a vision-language model on these examples for in-context inference, without analytic contact models or manually designed heuristics. On 129 object instances, Exp-Force achieves a best-case MAE of 0.43 N, reducing error by 72% over zero-shot inference. In real-world tests on 30 unseen objects, it improves the appropriate force-selection rate from 63% to 87%. These results demonstrate that Exp-Force enables reliable and generalizable pre-grasp force selection by leveraging prior interaction experiences. http://expforcesubmission.github.io/Exp-Force-Website/
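The retrieve-then-prompt pipeline the abstract describes can be sketched roughly as below. Everything here is a hypothetical illustration, not the paper's actual implementation: the experience bank, toy 3-D embeddings, object names, and force values are invented for the example, and a real system would embed the RGB image with a vision encoder and send the assembled prompt (plus the image) to a vision-language model.

```python
import math

# Hypothetical experience bank: each entry pairs a precomputed image
# embedding with the measured minimum feasible grasp force in newtons.
# The 3-D vectors are toy stand-ins for real vision-encoder embeddings.
EXPERIENCES = [
    {"name": "paper cup", "embedding": [0.9, 0.1, 0.0], "force_n": 1.2},
    {"name": "glass jar", "embedding": [0.1, 0.9, 0.1], "force_n": 4.5},
    {"name": "foam ball", "embedding": [0.8, 0.2, 0.1], "force_n": 0.8},
    {"name": "metal can", "embedding": [0.0, 0.8, 0.5], "force_n": 3.9},
]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_embedding, k=2):
    """Return the k past experiences most similar to the query image."""
    ranked = sorted(EXPERIENCES,
                    key=lambda e: cosine(query_embedding, e["embedding"]),
                    reverse=True)
    return ranked[:k]

def build_prompt(query_embedding, k=2):
    """Assemble an in-context prompt for the VLM from retrieved examples."""
    lines = ["Predict the minimum feasible grasping force in newtons."]
    for e in retrieve(query_embedding, k):
        lines.append(f"Example: {e['name']} -> {e['force_n']} N")
    lines.append("Now predict the force for the object in the attached image.")
    return "\n".join(lines)
```

The design point this sketch captures is that no contact model appears anywhere: prior (image, force) pairs are simply surfaced as few-shot examples, and the numeric prediction is delegated to the vision-language model's in-context reasoning.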
Problem

Research questions and friction points this paper is trying to address.

pre-grasp force selection
robotic manipulation
compliant grippers
vision-language models
grasping force prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

experience-conditioned learning
vision-language models
pre-grasp force prediction
in-context learning
robotic manipulation