🤖 AI Summary
To address the challenge of real-time animation and rendering of high-fidelity 3D Gaussian full-body avatars on memory- and bandwidth-constrained mobile/VR devices, this paper proposes a lightweight knowledge distillation framework. The method distills neural pose-correction knowledge into a shared linear layer and introduces a local Gaussian parameter-sharing mechanism, significantly reducing model size and GPU memory footprint. It further integrates a custom Vulkan-based rasterization pipeline with optimized Gaussian splatting. On the Meta Quest 3 platform, the approach is the first to demonstrate simultaneous real-time rendering of three Gaussian avatars at 72 FPS. Quantitative and qualitative evaluations confirm that the method preserves visual fidelity while substantially lowering computational overhead, enabling efficient on-device deployment.
📝 Abstract
Gaussian-based human avatars have achieved an unprecedented level of visual fidelity. However, existing approaches based on high-capacity neural networks typically require a desktop GPU to achieve real-time performance for even a single avatar, and it remains non-trivial to animate and render such avatars on mobile devices, including standalone VR headsets, due to their substantially limited memory and computational bandwidth. In this paper, we present SqueezeMe, a simple and highly effective framework that converts high-fidelity 3D Gaussian full-body avatars into a lightweight representation supporting both animation and rendering with mobile-grade compute. Our key observation is that decoding pose-dependent Gaussian attributes with a neural network incurs non-negligible memory and computational overhead. Inspired by the blendshapes and linear pose correctives widely used in computer graphics, we address this by distilling the pose correctives learned by neural networks into linear layers. We further reduce the parameter count by sharing correctives among nearby Gaussians. Combining these with a custom Vulkan-based splatting pipeline, we achieve, for the first time, simultaneous animation and rendering of 3 Gaussian avatars in real time (72 FPS) on a Meta Quest 3 VR headset.
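To make the two core ideas concrete, here is a minimal NumPy sketch of (a) distilling a neural pose-corrective decoder into a single linear layer via least squares, and (b) sharing each corrective among a group of nearby Gaussians. All names, shapes, and the stand-in "teacher" network are illustrative assumptions, not the paper's actual architecture or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

n_gaussians, group_size = 1200, 8     # hypothetical counts: 8 nearby Gaussians share one corrective
n_groups = n_gaussians // group_size
pose_dim, corr_dim = 69, 3            # SMPL-like pose vector; per-Gaussian xyz offset

# Stand-in "teacher": a small random MLP mapping pose -> group correctives
# (plays the role of the high-capacity neural decoder being distilled).
W1 = rng.normal(0, 0.1, (pose_dim, 128))
W2 = rng.normal(0, 0.1, (128, n_groups * corr_dim))
def teacher(pose):
    return np.tanh(pose @ W1) @ W2

# Distill: fit one linear layer (with bias) to the teacher's outputs
# over a batch of sampled training poses, by ordinary least squares.
poses = rng.normal(size=(512, pose_dim))
targets = np.array([teacher(p) for p in poses])        # (512, n_groups * corr_dim)
A = np.hstack([poses, np.ones((len(poses), 1))])       # append a bias column
W_lin, *_ = np.linalg.lstsq(A, targets, rcond=None)    # (pose_dim + 1, n_groups * corr_dim)

def student(pose):
    """Linear pose-corrective decoder with group-shared outputs."""
    flat = np.append(pose, 1.0) @ W_lin                # one matmul per frame
    group_corr = flat.reshape(n_groups, corr_dim)      # one corrective per group
    return np.repeat(group_corr, group_size, axis=0)   # broadcast to all Gaussians

corr = student(rng.normal(size=pose_dim))
print(corr.shape)                                      # one xyz offset per Gaussian
```

At runtime the student replaces the teacher entirely, so each animation frame costs a single matrix-vector product, and parameter sharing shrinks the output dimension by the group size.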