🤖 AI Summary
To address safety and precision challenges in bimanual teleoperation within cluttered environments—stemming from limited spatial awareness—this paper proposes a novel safe teleoperation system integrating immersive VR control with speech-driven obstacle avoidance. Methodologically, it pioneers the fusion of speech-guided visual grounding (i.e., speech-triggered target localization and instance segmentation) with real-time dynamic obstacle reconstruction via 3D mesh generation, embedded within a model-based whole-body motion controller to enable natural, low-cognitive-load collision avoidance. The system is implemented on the HTC Vive VR platform and integrates speech recognition, visual SLAM, real-time semantic segmentation, and 3D reconstruction. Experimental evaluation in static complex scenes demonstrates a 62% reduction in collision rate while maintaining a task completion rate exceeding 94%, thereby validating the system’s effectiveness and practicality.
📝 Abstract
Teleoperating precise bimanual manipulations in cluttered environments is challenging for operators, who often struggle with limited spatial perception and difficulty estimating distances between target objects, the robot's body, obstacles, and the surrounding environment. To address these challenges, local robot perception and control should assist the operator during teleoperation. In this work, we introduce a safe teleoperation system that enhances operator control by preventing collisions in cluttered environments through the combination of immersive VR control and voice-activated collision avoidance. Using HTC Vive controllers, operators directly control a bimanual mobile manipulator, while spoken commands such as "avoid the yellow tool" trigger visual grounding and segmentation to build 3D obstacle meshes. These meshes are integrated into a whole-body controller to actively prevent collisions during teleoperation. Experiments in static, cluttered scenes demonstrate that our system significantly improves operational safety without compromising task efficiency.