Generative Gaussian Splatting for Unbounded 3D City Generation

📅 2024-06-10
📈 Citations: 7
Influential: 1

🤖 AI Summary
To address memory explosion and scale limitations in city-scale 3D scene generation, this paper proposes GaussianCity, an efficient framework that synthesizes a city in a single feed-forward pass. It builds on 3D Gaussian Splatting and combines a bird's-eye-view (BEV) representation, a Point Serializer, and a decoding network. Its core contributions are: (1) BEV-Point, a compact intermediate representation that keeps the growth in VRAM usage constant as the scene expands; and (2) a spatial-aware Gaussian attribute decoder that uses the Point Serializer to integrate the structural and contextual characteristics of BEV points. Evaluated on both drone-view and street-view generation, GaussianCity achieves state-of-the-art generation quality and runs at 10.72 FPS, roughly 60× faster than CityDreamer (0.18 FPS). The compact representation is what makes unbounded city generation feasible without running out of memory.

📝 Abstract
3D city generation with NeRF-based methods shows promising generation results but is computationally inefficient. Recently, 3D Gaussian Splatting (3D-GS) has emerged as a highly efficient alternative for object-level 3D generation. However, adapting 3D-GS from finite-scale 3D objects and humans to infinite-scale 3D cities is non-trivial. Unbounded 3D city generation entails significant storage overhead (out-of-memory issues), arising from the need to expand points to billions, often demanding hundreds of gigabytes of VRAM for a city scene spanning 10 km². In this paper, we propose GaussianCity, a generative Gaussian Splatting framework dedicated to efficiently synthesizing unbounded 3D cities with a single feed-forward pass. Our key insights are two-fold: 1) Compact 3D Scene Representation: We introduce BEV-Point as a highly compact intermediate representation, ensuring that the growth in VRAM usage for unbounded scenes remains constant, thus enabling unbounded city generation. 2) Spatial-aware Gaussian Attribute Decoder: We present a spatial-aware BEV-Point decoder to produce 3D Gaussian attributes, which leverages a Point Serializer to integrate the structural and contextual characteristics of BEV points. Extensive experiments demonstrate that GaussianCity achieves state-of-the-art results in both drone-view and street-view 3D city generation. Notably, compared to CityDreamer, GaussianCity exhibits superior performance with a speedup of 60 times (10.72 FPS vs. 0.18 FPS).
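The abstract's two headline numbers can be sanity-checked with a little arithmetic. The sketch below assumes each Gaussian stores 59 float32 attributes, the layout commonly used by 3D-GS implementations with degree-3 spherical harmonics. That per-point size and the helper names are assumptions for illustration, not figures taken from the paper.

```python
# Back-of-the-envelope check of the numbers quoted in the abstract.
# Assumption: 59 float32 values per Gaussian, i.e.
# 3 position + 3 scale + 4 rotation (quaternion) + 1 opacity + 48 SH colour coefficients.
FLOATS_PER_GAUSSIAN = 3 + 3 + 4 + 1 + 48
BYTES_PER_FLOAT32 = 4


def gaussian_memory_gb(num_points: int) -> float:
    """Raw attribute storage in gigabytes (10**9 bytes), ignoring gradients and other buffers."""
    return num_points * FLOATS_PER_GAUSSIAN * BYTES_PER_FLOAT32 / 1e9


def speedup(fast_fps: float, slow_fps: float) -> float:
    """Ratio of two frame rates."""
    return fast_fps / slow_fps


print(f"{gaussian_memory_gb(1_000_000_000):.0f} GB for one billion Gaussians")  # 236 GB
print(f"{speedup(10.72, 0.18):.1f}x speedup")  # 59.6x
```

Under that assumption, a single billion points already needs a few hundred gigabytes, which is consistent with the abstract's "hundreds of gigabytes", and the quoted frame rates give a ratio of about 59.6, which the abstract rounds to 60.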
Problem

Research questions and friction points this paper is trying to address.

How to generate unbounded 3D cities efficiently.
How to keep VRAM usage under control for large-scale scenes.
How to speed up 3D city synthesis relative to NeRF-based methods.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative Gaussian Splatting framework
Compact BEV-Point scene representation
Spatial-aware Gaussian Attribute Decoder
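
To give a rough intuition for why a BEV-based representation can be compact, the sketch below stores a scene as two small 2D grids (a height map and a semantic map) and expands them into 3D points only on demand. This is an illustrative toy under stated assumptions, not the paper's BEV-Point construction: the function `bev_to_points`, the integer height convention, and the label values are all hypothetical.

```python
def bev_to_points(height, semantic):
    """Expand a BEV height map into (x, y, z, label) tuples.

    Hypothetical helper for illustration only. Each grid cell (x, y) with
    integer height h yields h points stacked at z = 0 .. h - 1, all carrying
    the semantic label of that cell.
    """
    points = []
    for y, row in enumerate(height):
        for x, h in enumerate(row):
            for z in range(h):
                points.append((x, y, z, semantic[y][x]))
    return points


# A 2x2 toy scene: heights in voxels, labels are arbitrary class ids.
height = [[0, 2],
          [1, 3]]
semantic = [[0, 1],
            [1, 2]]

points = bev_to_points(height, semantic)
print(len(points))   # 6 points, one per occupied voxel
print(points[0])     # (1, 0, 0, 1)
```

The two grids hold a fixed number of values per ground cell, while the number of expanded points grows with building height, so keeping the 2D maps as the stored form and expanding only what is needed is one way to limit memory.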