ReVo: A Cross-Layer Reliable Volumetric Videoconferencing System

📅 2026-04-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenges of volumetric videoconferencing, where high bandwidth demands, strict real-time latency constraints, and packet loss cause visual and geometric distortions as well as video freezes. The paper presents ReVo, a cross-layer, modality-aware system that jointly recovers RGB and depth content under packet loss. It decouples volumetric video into RGB and depth streams, selectively protects critical content with network-layer forward error correction (FEC), and reconstructs corrupted non-critical frames with a post-decode neural recovery module. Implemented end-to-end over WebRTC, the system supports both traditional and neural video codecs. Evaluations using real-world loss traces show that ReVo improves median SSIM by up to 32% for RGB content and up to 13% for depth content, and reduces video freezes by up to 95.7% compared to existing techniques.

📝 Abstract

Volumetric videoconferencing enables immersive six Degrees of Freedom interactions by jointly transmitting visual appearance and 3D geometry. However, delivering volumetric video over today's networks remains challenging due to high bandwidth demands, strict real-time latency constraints, and frequent packet loss. Packet loss not only degrades visual quality but also corrupts geometric structure, leading to severe artifacts and video freezes that significantly degrade Quality of Experience. Existing solutions either optimize volumetric videos assuming reliable networks or focus on loss recovery for 2D video, and are insufficient for volumetric videoconferencing. In this paper, we present ReVo, a loss-resilient volumetric videoconferencing system that jointly recovers RGB and depth content under packet loss while meeting real-time constraints on desktop-grade hardware. ReVo leverages the insight that effective recovery requires a cross-layer, modality-aware design. It decouples volumetric video into RGB and depth streams, selectively protects critical content using network-layer FEC, and reconstructs corrupted non-critical frames using a post-decode neural recovery module. ReVo is implemented end-to-end over WebRTC and supports both traditional and neural video codecs. Our evaluations using real-world loss traces show that ReVo improves median SSIM by up to 32% (resp. 13%) for RGB (resp. depth) content and reduces video freezes by up to 95.7% compared to existing techniques.
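To make the idea of selectively protecting critical content with FEC concrete, the following is a minimal sketch of single-loss XOR parity over a group of equal-length packets. It is an illustration only: the abstract does not state which FEC code ReVo uses, how packets are grouped, or how content is classified as critical, so the XOR scheme, the function names `xor_parity` and `recover`, and the example payloads are all assumptions.

```python
from functools import reduce
from typing import Optional


def xor_parity(packets: list[bytes]) -> bytes:
    """Return the byte-wise XOR of a group of equal-length packets."""
    length = len(packets[0])
    if any(len(p) != length for p in packets):
        raise ValueError("packets must be padded to equal length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*packets))


def recover(received: list[Optional[bytes]], parity: bytes) -> list[bytes]:
    """Rebuild at most one lost packet (marked None) using the parity packet."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can repair only one loss per group")
    repaired = list(received)
    if missing:
        survivors = [p for p in received if p is not None]
        # XOR of the survivors and the parity cancels everything but the lost packet.
        repaired[missing[0]] = xor_parity(survivors + [parity])
    return repaired


# Hypothetical group of packets judged critical (payloads are placeholders).
critical_group = [b"rgb0", b"dep0", b"rgb1"]
parity = xor_parity(critical_group)

# Simulate losing the depth packet in transit, then repair it.
restored = recover([b"rgb0", None, b"rgb1"], parity)
```

In a design like the one the abstract describes, only packets carrying critical content would pay this redundancy cost, while losses in non-critical frames would be left to the post-decode neural recovery module.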

Problem

Research questions and friction points this paper is trying to address.

volumetric videoconferencing

packet loss

real-time constraints

Quality of Experience

3D geometry

Innovation

Methods, ideas, or system contributions that make the work stand out.

volumetric videoconferencing

cross-layer design

neural recovery