PoseGaussian: Pose-Driven Novel View Synthesis for Robust 3D Human Reconstruction

📅 2026-02-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the degradation in 3D reconstruction and novel view synthesis quality caused by articulated motion and severe self-occlusion in dynamic human scenes. To this end, we propose a pose-driven Gaussian splatting framework that leverages human body pose as a structural prior for both geometry modeling and temporal consistency. Our approach integrates a pose encoder with a joint color-depth network to form an end-to-end differentiable pipeline that supports real-time rendering. In contrast to prior methods that treat pose merely as a conditioning signal or deformation guide, our formulation embeds pose into both the geometric and temporal stages, significantly enhancing reconstruction robustness and generalization. The method achieves state-of-the-art performance on the ZJU-MoCap and THuman2.0 datasets, reporting a PSNR of 30.86, SSIM of 0.979, and LPIPS of 0.028, while rendering in real time at 100 FPS.

📝 Abstract
We propose PoseGaussian, a pose-guided Gaussian Splatting framework for high-fidelity human novel view synthesis. Human body pose serves a dual purpose in our design: as a structural prior, it is fused with a color encoder to refine depth estimation; as a temporal cue, it is processed by a dedicated pose encoder to enhance temporal consistency across frames. These components are integrated into a fully differentiable, end-to-end trainable pipeline. Unlike prior works that use pose only as a condition or for warping, PoseGaussian embeds pose signals into both geometric and temporal stages to improve robustness and generalization. It is specifically designed to address challenges inherent in dynamic human scenes, such as articulated motion and severe self-occlusion. Notably, our framework achieves real-time rendering at 100 FPS, maintaining the efficiency of standard Gaussian Splatting pipelines. We validate our approach on ZJU-MoCap, THuman2.0, and in-house datasets, demonstrating state-of-the-art performance in perceptual quality and structural accuracy (PSNR 30.86, SSIM 0.979, LPIPS 0.028).
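The abstract's dual use of pose can be sketched in code: pose features act as a structural prior fused with color features to refine depth, and the same encoder output can serve as a per-frame temporal cue. The sketch below is an illustrative assumption, not the paper's released implementation; module names, joint count, and feature sizes are hypothetical.

```python
import torch
import torch.nn as nn

class PoseEncoder(nn.Module):
    """Encodes per-frame body joints (J x 3) into a feature vector.

    Hypothetical stand-in for the paper's pose encoder; 24 joints
    (SMPL-style) and a 128-d feature are assumed sizes.
    """
    def __init__(self, num_joints: int = 24, feat_dim: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_joints * 3, 256),
            nn.ReLU(),
            nn.Linear(256, feat_dim),
        )

    def forward(self, joints: torch.Tensor) -> torch.Tensor:
        # joints: (B, J, 3) -> pose feature: (B, feat_dim)
        return self.mlp(joints.flatten(1))

class PoseGuidedDepth(nn.Module):
    """Fuses color features with the pose feature to predict a per-pixel
    depth refinement, mirroring the abstract's 'pose fused with a color
    encoder to refine depth estimation'."""
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.color_enc = nn.Conv2d(3, feat_dim, kernel_size=3, padding=1)
        # 1x1 conv over concatenated color + broadcast pose features
        self.fuse = nn.Conv2d(feat_dim * 2, 1, kernel_size=1)

    def forward(self, image: torch.Tensor, pose_feat: torch.Tensor) -> torch.Tensor:
        # image: (B, 3, H, W); pose_feat: (B, feat_dim)
        c = self.color_enc(image)
        # Broadcast the pose feature spatially so it conditions every pixel.
        p = pose_feat[:, :, None, None].expand(-1, -1, c.shape[2], c.shape[3])
        return self.fuse(torch.cat([c, p], dim=1))  # (B, 1, H, W) depth residual

# Toy forward pass with random inputs.
B, H, W = 2, 64, 64
pose_enc = PoseEncoder()
depth_net = PoseGuidedDepth()
feat = pose_enc(torch.randn(B, 24, 3))
depth_residual = depth_net(torch.randn(B, 3, H, W), feat)
print(tuple(depth_residual.shape))
```

In a full pipeline, the refined depth would back-project pixels into 3D to initialize Gaussian centers, and the pose feature would additionally condition a temporal module across frames; both stages remain differentiable, consistent with the end-to-end training described above.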
Problem

Research questions and friction points this paper is trying to address.

novel view synthesis
3D human reconstruction
articulated motion
self-occlusion
dynamic human scenes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pose-Driven
Gaussian Splatting
Novel View Synthesis
Temporal Consistency
3D Human Reconstruction