🤖 AI Summary
To address the challenge of real-time, photorealistic rendering of large-scale dynamic crowds, this paper proposes the first crowd animation rendering framework based on 3D Gaussian Splatting (3DGS). The method operates in two stages: (i) decoupled reconstruction of pose- and appearance-controllable Gaussian representations for individuals with diverse poses and clothing from monocular video; and (ii) real-time synthesis of crowds exceeding one thousand agents via level-of-detail (LoD) scheduling, GPU memory-aware optimization, and a neural implicit animation representation. Key contributions include: (1) the first integration of 3DGS into crowd rendering; (2) a pose-appearance disentangled, animatable Gaussian representation; and (3) a real-time LoD strategy that jointly optimizes memory footprint, visual quality, and frame rate. Experiments demonstrate superior performance over point-cloud and NeRF baselines in PSNR and SSIM, with a 42% reduction in VRAM consumption and sustained rendering above 60 FPS, achieving both high fidelity and practical deployability.
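The LoD scheduling described above can be sketched as distance-based tier selection with a per-frame Gaussian budget. The tier thresholds, per-agent Gaussian counts, and function names below are illustrative assumptions for exposition, not the paper's actual schedule, which jointly optimizes memory, quality, and frame rate.

```python
# Hypothetical LoD tiers: (max camera distance, Gaussians per agent).
# The numbers are placeholders, not CrowdSplat's measured settings.
LOD_TIERS = [
    (10.0, 50_000),         # near: full-detail avatar
    (30.0, 12_000),         # mid: reduced Gaussian count
    (float("inf"), 2_000),  # far: coarse splat set
]

def select_lod(distance: float) -> int:
    """Return the LoD tier index for an agent at the given camera distance."""
    for tier, (max_dist, _) in enumerate(LOD_TIERS):
        if distance <= max_dist:
            return tier
    return len(LOD_TIERS) - 1

def gaussian_budget(distances: list[float]) -> int:
    """Total Gaussians rasterized in one frame, given each agent's distance."""
    return sum(LOD_TIERS[select_lod(d)][1] for d in distances)
```

Under such a scheme, a thousand-agent crowd stays renderable because most agents fall into the far tier, so the total Gaussian count (and hence VRAM and rasterization cost) grows far slower than a full-detail crowd would.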
📝 Abstract
We present CrowdSplat, a novel approach that leverages 3D Gaussian Splatting for real-time, high-quality crowd rendering. Our method uses 3D Gaussians to represent animated human characters, reconstructed from monocular videos, in diverse poses and outfits. We integrate Level of Detail (LoD) rendering to balance computational efficiency and visual quality. The CrowdSplat framework consists of two stages: (1) avatar reconstruction and (2) crowd synthesis, and is further optimized for GPU memory usage to enhance scalability. Quantitative and qualitative evaluations show that CrowdSplat achieves strong rendering quality, memory efficiency, and computational performance. Through these experiments, we demonstrate that CrowdSplat is a viable solution for dynamic, realistic crowd simulation in real-time applications.
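For readers unfamiliar with 3DGS, a standard anisotropic 3D Gaussian (the primitive underlying both stages) is parameterized by a mean, a per-axis scale, and a rotation; its covariance is assembled as Sigma = R S Sᵀ Rᵀ. The sketch below shows that standard 3DGS construction in NumPy; it is generic background, not CrowdSplat's specific avatar formulation.

```python
import numpy as np

def gaussian_covariance(scale: np.ndarray, rot: np.ndarray) -> np.ndarray:
    """Covariance of an anisotropic 3D Gaussian, Sigma = R S S^T R^T,
    with S = diag(scale) and rot a 3x3 rotation matrix (standard 3DGS).
    This factorization keeps Sigma positive semi-definite during optimization."""
    S = np.diag(scale)
    return rot @ S @ S.T @ rot.T
```

With the identity rotation this reduces to a diagonal covariance of squared scales; a nontrivial rotation orients the ellipsoid, which is what lets a modest number of Gaussians fit elongated surface details like limbs and clothing folds.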