🤖 AI Summary
This work addresses the challenge of preserving both fidelity and multi-view consistency under extreme upscaling when reconstructing 3D scenes from low-resolution inputs. The authors propose a progressive generative 3D Gaussian splatting framework that combines geometry-consistent scene modeling with multi-scale semantic reasoning. Key components include a multi-view consistent super-resolution module, an expandable continuous level-of-detail hierarchy, depth-based feature warping, vision-language-model-driven detail synthesis, and a dynamic Gaussian visibility modulation mechanism. Experiments on the Mip-NeRF360 and Tanks & Temples datasets indicate that the method outperforms existing approaches in extreme-magnification rendering, with gains in perceptual quality, cross-scale smoothness, and multi-view consistency.
📝 Abstract
We introduce GaussianZoom, a generative zoom-in 3D reconstruction system with an iterative progressive framework that combines geometry-consistent scene modeling and multi-scale semantic reasoning to enable high-fidelity extreme zoom-in rendering from low-resolution inputs. To achieve this, we develop a novel multi-view consistent super-resolution module with depth-based feature warping and VLM-driven detail synthesis, ensuring accurate multi-view correspondence while enriching fine-scale appearance beyond the observed resolution. To support zooming across large magnification ranges, we further introduce a new expandable continuous Level-of-Detail hierarchy that dynamically modulates Gaussian visibility for smooth, alias-free cross-scale rendering. Experiments on Mip-NeRF360 and Tanks & Temples demonstrate that GaussianZoom achieves superior perceptual quality, multi-view consistency, and robustness under extreme magnification, establishing a strong baseline for generative zoom-in 3D scene reconstruction.
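The abstract does not specify how Gaussian visibility is modulated across the Level-of-Detail hierarchy. As a rough illustration of the general idea only, the sketch below assumes each Gaussian carries an integer detail level and fades in smoothly as the magnification passes the scale at which it was added. The function names (`visibility_weight`, `modulated_opacities`), the power-of-two spacing of levels, and the smoothstep blend are all assumptions made for this example, not details taken from GaussianZoom.

```python
import math


def smoothstep(t: float) -> float:
    """Clamp t to [0, 1] and apply the cubic smoothstep 3t^2 - 2t^3."""
    t = min(max(t, 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def visibility_weight(gaussian_level: int, magnification: float) -> float:
    """Hypothetical continuous LoD weight in [0, 1].

    Assumption: level-k Gaussians (k >= 1) fade in while the magnification
    grows from 2**(k-1) to 2**k. Level-0 Gaussians form the base scene
    and are always fully visible.
    """
    if magnification <= 0.0:
        raise ValueError("magnification must be positive")
    if gaussian_level <= 0:
        return 1.0
    zoom = math.log2(magnification)  # continuous zoom level
    return smoothstep(zoom - (gaussian_level - 1))


def modulated_opacities(opacities, levels, magnification):
    """Scale each Gaussian's opacity by its visibility weight."""
    return [
        alpha * visibility_weight(level, magnification)
        for alpha, level in zip(opacities, levels)
    ]
```

Because the weight changes continuously with magnification, finer Gaussians blend in gradually rather than popping into view at a fixed threshold, which is the kind of smooth cross-scale behavior the abstract describes. A renderer would apply `modulated_opacities` to the per-Gaussian opacities before rasterization.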