🤖 AI Summary
Current controllable photorealistic portrait generation methods suffer from pervasive rendering blur, limiting visual fidelity and practical applicability. To address this, we propose a neural-rendering-based framework for real-time, high-fidelity digital human avatar synthesis. Our approach introduces a novel cylindrical-surface Gaussian point cloud representation, enhanced with skeletal priors, to achieve geometric densification. We further design a view-aware segmentation module to improve cross-view semantic consistency. The end-to-end architecture integrates Gaussian splatting rendering, a UNet-based generator, camera calibration, and multimodal feature fusion. Evaluated on the ZJU Mocap and THuman4 benchmarks, our method achieves PSNR scores of 32.94 dB and 33.39 dB, respectively, while sustaining a rendering speed of 79 FPS, significantly outperforming state-of-the-art approaches in both quality and efficiency.
📝 Abstract
Photorealistic and controllable human avatars have gained popularity in the research community thanks to rapid advances in neural rendering, which provide fast and realistic synthesis tools. However, a limitation of current solutions is noticeable blurring. To solve this problem, we propose GaussianGAN, an animatable avatar approach developed for photorealistic rendering of people in real time. We introduce a novel Gaussian splatting densification strategy that builds Gaussian points on the surface of cylindrical structures around estimated skeletal limbs. Given the camera calibration, we render an accurate semantic segmentation with our novel view segmentation module. Finally, a UNet generator uses the rendered Gaussian splatting features and the segmentation maps to create photorealistic digital avatars. Our method runs in real time at a rendering speed of 79 FPS. It outperforms previous methods in visual perception and quality, achieving state-of-the-art pixel fidelity of 32.94 dB on the ZJU Mocap dataset and 33.39 dB on the THuman4 dataset.
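The densification idea described above, placing Gaussian points on the surface of a cylinder fitted around each estimated skeletal limb, can be sketched geometrically as follows. This is a minimal illustration under assumed conventions, not the paper's implementation: the function name, the fixed radius, and the sampling densities (`n_rings`, `n_per_ring`) are all hypothetical choices.

```python
import numpy as np

def cylinder_surface_points(p0, p1, radius, n_rings=8, n_per_ring=16):
    """Sample candidate Gaussian centers on the lateral surface of a
    cylinder whose axis runs from joint p0 to joint p1 (both (3,) arrays).

    Hypothetical sketch: the real method would choose radius and density
    per limb and attach learnable Gaussian parameters to each point.
    """
    axis = p1 - p0
    length = np.linalg.norm(axis)
    d = axis / length  # unit direction along the limb
    # Build an orthonormal basis (u, v) spanning the plane perpendicular to d.
    helper = np.array([1.0, 0.0, 0.0]) if abs(d[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(d, helper)
    u /= np.linalg.norm(u)
    v = np.cross(d, u)
    # Rings spaced evenly along the limb; points spaced evenly around each ring.
    ts = np.linspace(0.0, 1.0, n_rings)
    thetas = np.linspace(0.0, 2.0 * np.pi, n_per_ring, endpoint=False)
    centers = p0[None, :] + ts[:, None] * axis[None, :]            # (n_rings, 3)
    offsets = radius * (np.cos(thetas)[:, None] * u[None, :]
                        + np.sin(thetas)[:, None] * v[None, :])    # (n_per_ring, 3)
    return (centers[:, None, :] + offsets[None, :, :]).reshape(-1, 3)
```

In a full pipeline, each sampled point would presumably be initialized as a 3D Gaussian (with learnable covariance, opacity, and feature vector, as in standard Gaussian splatting) and then rasterized to produce the feature maps consumed by the UNet generator.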