🤖 AI Summary
To address the trade-off among narrow field-of-view (FoV) of onboard depth sensors, high bandwidth consumption, and low rendering fidelity in mobile robot teleoperation, this paper proposes a novel paradigm integrating on-board depth sensing with 3D Gaussian Splatting (3DGS)-based environmental modeling. We pioneer the use of 3DGS as a geometric-semantic extension medium for sparse depth data, enabling wide-FoV, high-fidelity, high-frame-rate streaming rendering. A multi-source pose alignment and voxel-level深度融合 algorithm is designed to establish an end-to-end low-latency closed-loop control pipeline. Furthermore, we develop a lightweight VR interaction framework and an embedded telepresence hardware platform. A user study (n=24) demonstrates a 37% improvement in operational efficiency, a 42% increase in situational awareness scores, and 96% user preference for our system. All code, pre-trained models, and hardware schematics are fully open-sourced.
📝 Abstract
We introduce Reality Fusion, a novel robot teleoperation system that localizes, streams, projects, and merges a typical onboard depth sensor with a photorealistic, high resolution, high framerate, and wide FoV rendering of the complex remote environment represented as 3D Gaussian splats (3DGS). Our framework enables robust egocentric and exocentric robot teleoperation in immersive VR, with the 3DGS effectively extending spatial information of a depth sensor with limited FoV and balancing the trade-off between data streaming costs and data visual quality. We evaluated our framework through a user study with 24 participants, which revealed that Reality Fusion leads to significantly better user performance, situation awareness, and user preferences. To support further research and development, we provide an open-source implementation with an easy-to-replicate custom-made telepresence robot, a high-performance virtual reality 3DGS renderer, and an immersive robot control package.1