CaFe-TeleVision: A Coarse-to-Fine Teleoperation System with Immersive Situated Visualization for Enhanced Ergonomics

📅 2025-12-16
🤖 AI Summary
To address the low operational efficiency and poor ergonomics of existing teleoperation systems in complex scenarios, this paper proposes a coarse-to-fine teleoperation framework tailored for humanoid collaborative robots. The method integrates kinematic remapping, multi-view adaptive rendering, real-time bimanual pose estimation, and an ergonomic assessment framework. Key contributions include: (1) a novel two-level motion retargeting mechanism enabling workspace-adaptive mapping; and (2) a vision-based cognitive load modeling approach for on-demand contextual visualization, balancing immersion with sub-100-ms feedback latency. Evaluated across six bimanual manipulation tasks, the system surpasses comparative methods by up to 28.89% in task success rate and reduces completion time by 26.81%. A user study (n=24) confirms statistically significant reductions in subjective workload, alongside marked improvements in operational comfort and system acceptability.

📝 Abstract
Teleoperation presents a promising paradigm for remote control and robot proprioceptive data collection. Despite recent progress, current teleoperation systems still suffer from limitations in efficiency and ergonomics, particularly in challenging scenarios. In this paper, we propose CaFe-TeleVision, a coarse-to-fine teleoperation system with immersive situated visualization for enhanced ergonomics. At its core, a coarse-to-fine control mechanism is proposed in the retargeting module to bridge workspace disparities, jointly optimizing efficiency and physical ergonomics. To stream immersive feedback with adequate visual cues for human vision systems, an on-demand situated visualization technique is integrated in the perception module, which reduces the cognitive load for multi-view processing. The system is built on a humanoid collaborative robot and validated with six challenging bimanual manipulation tasks. A user study with 24 participants confirms that CaFe-TeleVision enhances ergonomics with statistical significance, indicating a lower task load and a higher user acceptance during teleoperation. Quantitative results also validate the superior performance of our system across six tasks, surpassing comparative methods by up to 28.89% in success rate and accelerating by 26.81% in completion time. Project webpage: https://clover-cuhk.github.io/cafe_television/
Problem

Research questions and friction points this paper is trying to address.

Bridging workspace disparities for efficiency and ergonomics
Reducing cognitive load in multi-view teleoperation processing
Enhancing user ergonomics and task performance in teleoperation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Coarse-to-fine control mechanism for workspace disparities
On-demand situated visualization reduces cognitive load
System validated on humanoid robot with bimanual tasks
Authors
Zixin Tang
Department of Mechanical and Automation Engineering, T-Stone Robotics Institute, The Chinese University of Hong Kong, Hong Kong SAR
Yiming Chen
Department of Mechanical and Automation Engineering, T-Stone Robotics Institute, The Chinese University of Hong Kong, Hong Kong SAR
Quentin Rouxel
CUHK
Robotic, Humanoid Robots, Multi-Contact, Whole-Body Control, Imitation Learning
Dianxi Li
The Chinese University of Hong Kong
LLM, Robotic Manipulation, Robotics, Force Control, Robotic Grasping
Shuang Wu
Huawei Hong Kong Research Center
Fei Chen
Department of Mechanical and Automation Engineering, T-Stone Robotics Institute, The Chinese University of Hong Kong, Hong Kong SAR