🤖 AI Summary
Existing approaches lack a unified, scalable Gaussian representation that supports progressive reconstruction in both quality and resolution for images and videos alike. This work proposes a hierarchical progressive 2D Gaussian splatting framework that enables coarse-to-fine scalable reconstruction through a base layer and multiple enhancement layers. A cross-layer joint training mechanism optimizes the Gaussian parameters of all layers concurrently, ensuring inter-layer compatibility and stable reconstruction. According to the authors, this is the first method to achieve a unified hierarchical Gaussian representation for both images and videos. Compared with sequential layer-by-layer training, it yields PSNR improvements of up to 2.6 dB for images and up to 1.9 dB for videos.
📝 Abstract
Gaussian splatting has emerged as a competitive explicit representation for image and video reconstruction. In this work, we present P-GSVC, the first layered progressive 2D Gaussian splatting framework that provides a unified solution for scalable Gaussian representation in both images and videos. P-GSVC organizes 2D Gaussian splats into a base layer and successive enhancement layers, enabling coarse-to-fine reconstruction. To optimize this layered representation effectively, we propose a joint training strategy that updates the Gaussians of all layers simultaneously, aligning their optimization trajectories to ensure inter-layer compatibility and stable progressive reconstruction. P-GSVC supports scalability in both quality and resolution. Our experiments show that, compared with sequential layer-wise training, the joint training strategy improves PSNR by up to 1.9 dB for video and up to 2.6 dB for images. Project page: https://longanwang-cs.github.io/PGSVC-webpage/
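The layered idea described above can be illustrated with a minimal sketch. It is not the P-GSVC implementation: it assumes isotropic grayscale Gaussians, a single full-resolution target (quality scalability only), and an unweighted sum of per-level errors as the joint objective. The function names `render`, `progressive_reconstructions`, and `joint_loss` are hypothetical, and the abstract does not specify the actual parameterization or loss weighting.

```python
import math


def render(gaussians, width, height):
    """Accumulate isotropic 2D Gaussians (cx, cy, sigma, amplitude) into a grayscale image."""
    image = [[0.0] * width for _ in range(height)]
    for cx, cy, sigma, amplitude in gaussians:
        for y in range(height):
            for x in range(width):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                image[y][x] += amplitude * math.exp(-d2 / (2.0 * sigma ** 2))
    return image


def mse(a, b):
    """Mean squared error between two images of equal size."""
    n = len(a) * len(a[0])
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)) / n


def psnr(a, b, peak=1.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(a, b)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)


def progressive_reconstructions(layers, width, height):
    """Level k renders the base layer plus the first k enhancement layers."""
    active = []
    outputs = []
    for layer in layers:
        active = active + layer  # each level reuses all Gaussians of the levels below it
        outputs.append(render(active, width, height))
    return outputs


def joint_loss(layers, target, width, height):
    """Assumed joint objective: the sum of reconstruction errors over all levels.

    Every layer's parameters influence every level at or above it, so minimizing
    this sum updates all layers together rather than one layer at a time.
    """
    levels = progressive_reconstructions(layers, width, height)
    return sum(mse(image, target) for image in levels)
```

In a full training loop, an automatic-differentiation framework would take the gradient of `joint_loss` with respect to the parameters of every layer at once. A sequential baseline would instead optimize the base layer alone, freeze it, and then fit each enhancement layer in turn.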