A Multi-View 3D Telepresence System for XR Robot Teleoperation

📅 2026-04-04
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Traditional screen-based interfaces lack effective depth cues, limiting the intuitiveness and efficiency of robotic teleoperation. This work proposes a multi-view telepresence system tailored for standalone VR headsets (Meta Quest 3), which, for the first time, fuses geometric data from three synchronized cameras to generate GPU-accelerated point clouds in real time, while integrating wrist-mounted RGB video streams to augment local high-resolution texture details. The system renders immersive 3D scenes comprising approximately 75,000 points by synergistically combining global 3D structure with fine-grained visual cues. In a controlled user study with 31 participants, the proposed approach significantly outperformed baseline methods—including RGB-only, point-cloud-only, and OpenTeleVision—achieving superior performance across task success rate, completion time, subjective workload, and system usability.
📝 Abstract
Robot teleoperation is critical for applications such as remote maintenance, fleet robotics, search and rescue, and data collection for robot learning. Effective teleoperation requires intuitive 3D visualization with reliable depth cues, which conventional screen-based interfaces often fail to provide. We introduce a multi-view VR telepresence system that (1) fuses geometry from three cameras to produce GPU-accelerated point-cloud rendering on standalone VR hardware, and (2) integrates a wrist-mounted RGB stream to provide high-resolution local detail where point-cloud accuracy is limited. Our pipeline supports real-time rendering of approximately 75k points on the Meta Quest 3. A within-subject study was conducted with 31 participants to compare our system to other visualisation modalities, such as RGB streams, a projection of stereo-vision directly in the VR device and point clouds without providing additional RGB information. Across three different teleoperated manipulation tasks, we measured task success, completion time, perceived workload, and usability. Our system achieved the best overall performance, while the Point Cloud modality without RGB also outperforming the RGB streams and OpenTeleVision. These results show that combining global 3D structure with localized high-resolution detail substantially improves telepresence for manipulation and provides a strong foundation for next-generation robot teleoperation systems.
Problem

Research questions and friction points this paper is trying to address.

robot teleoperation
3D visualization
depth perception
telepresence
XR
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-view 3D telepresence
point-cloud rendering
VR teleoperation
RGB-depth fusion
standalone VR
🔎 Similar Papers
No similar papers found.
E
Enes Ulas Dincer
Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, 76137 Karlsruhe, Germany
M
Manuel Zaremski
Institute of Human and Industrial Engineering, Karlsruhe Institute of Technology, 76137 Karlsruhe, Germany
A
Alexandra Nick
Institute of Human and Industrial Engineering, Karlsruhe Institute of Technology, 76137 Karlsruhe, Germany
E
Elias Wucher
Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, 76137 Karlsruhe, Germany
B
Barbara Deml
Institute of Human and Industrial Engineering, Karlsruhe Institute of Technology, 76137 Karlsruhe, Germany
Gerhard Neumann
Gerhard Neumann
Professor, Karlsruhe Institute of Technology (KIT)
RoboticsMachine Learning