Visio-Verbal Teleimpedance Interface: Enabling Semi-Autonomous Control of Physical Interaction via Eye Tracking and Speech

📅 2025-08-27
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study addresses the limited intuitiveness and precision of human–robot physical interaction in stiffness teleoperation for remote robotics. We propose a semi-autonomous 3D stiffness ellipsoid control method integrating gaze and speech inputs. Real-time gaze tracking is performed using Tobii Pro Glasses 2, while a GPT-4o–driven vision-language model interprets multimodal commands (speech + visual context) to enable context-aware stiffness modulation of a KUKA LBR iiwa manipulator and Force Dimension Sigma.7 haptic device. To our knowledge, this is the first work introducing dual-modal (gaze–speech) input into stiffness teleoperation interfaces, establishing an end-to-end intent understanding and ellipsoid parameter mapping framework. Experimental results demonstrate that our prompting strategy significantly improves intent recognition accuracy; in a slot-following task, it supports multi-dimensional control—including stiffness center localization, axial scaling, and orientation adjustment—yielding a 23% improvement in task completion efficiency and a 37% increase in subjective intuitiveness ratings.

Technology Category

Application Category

📝 Abstract
The paper presents a visio-verbal teleimpedance interface for commanding 3D stiffness ellipsoids to the remote robot with a combination of the operator's gaze and verbal interaction. The gaze is detected by an eye-tracker, allowing the system to understand the context in terms of what the operator is currently looking at in the scene. Along with verbal interaction, a Visual Language Model (VLM) processes this information, enabling the operator to communicate their intended action or provide corrections. Based on these inputs, the interface can then generate appropriate stiffness matrices for different physical interaction actions. To validate the proposed visio-verbal teleimpedance interface, we conducted a series of experiments on a setup including a Force Dimension Sigma.7 haptic device to control the motion of the remote Kuka LBR iiwa robotic arm. The human operator's gaze is tracked by Tobii Pro Glasses 2, while human verbal commands are processed by a VLM using GPT-4o. The first experiment explored the optimal prompt configuration for the interface. The second and third experiments demonstrated different functionalities of the interface on a slide-in-the-groove task.
Problem

Research questions and friction points this paper is trying to address.

Enabling semi-autonomous robot control through eye tracking
Generating appropriate stiffness matrices for physical interaction
Processing verbal commands with visual language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Eye tracking for context understanding
Verbal commands processed by VLM
Generating stiffness matrices for interaction
🔎 Similar Papers
No similar papers found.
H
Henk H.A. Jekel
Department of Cognitive Robotics, Delft University of Technology, Mekelweg 2, 2628 CD Delft, The Netherlands.
A
Alejandro Díaz Rosales
Department of Cognitive Robotics, Delft University of Technology, Mekelweg 2, 2628 CD Delft, The Netherlands. European Organization for Nuclear Research (CERN), Espl. des Particules 1, 1211 Meyrin, Switzerland.
Luka Peternel
Luka Peternel
Delft University of Technology
TeleoperationPhysical Human-Robot InteractionRobot LearningShared ControlHuman Motor Control