CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities

📅 2025-01-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenges of unbounded 4D urban generation, including the difficulty of modeling dynamic objects, human sensitivity to spatial distortion, and low fidelity in multi-scale structural representation, this paper proposes a compositional 4D urban generation framework that decouples motion from static structure. Methodologically: (1) it introduces a paradigm separating static scenes (buildings, background stuff) from dynamic traffic entities (vehicles); (2) it designs dual-path neural fields (stuff-oriented and instance-oriented), parameterized by customized generative hash grids and periodic positional embeddings; and (3) it contributes a suite of urban datasets: OSM for real-world city layouts, and Google Earth and CityTopia for large-scale, high-quality city imagery with 3D instance annotations. Experiments demonstrate state-of-the-art structural plausibility and visual realism. The compositional design enables downstream tasks including instance editing, city stylization, and urban simulation, improving editability, scalability, and spatiotemporal consistency.
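The motion-static separation described above implies that static and dynamic objects can be rendered independently and then merged per frame. As an illustrative sketch only (the paper's renderer composites volumetric neural fields; the depth-based picker and all names below are assumptions for exposition), nearest-surface compositing of per-object renderings might look like:

```python
import numpy as np

def composite_by_depth(layers):
    """Merge per-object renderings into one frame by taking, at each pixel,
    the color from the layer with the smallest depth (nearest surface).

    layers: list of (color, depth) pairs; color is (H, W, 3), depth is
    (H, W) with np.inf where the layer covers nothing.
    """
    colors = np.stack([c for c, _ in layers])   # (K, H, W, 3)
    depths = np.stack([d for _, d in layers])   # (K, H, W)
    nearest = np.argmin(depths, axis=0)         # (H, W) index of winning layer
    h, w = nearest.shape
    # Fancy-index the winning layer's color at every pixel.
    return colors[nearest, np.arange(h)[:, None], np.arange(w)[None, :]]
```

Because the static layers (buildings, background) do not change over time, they could be rendered once and reused, while only the vehicle layers are re-rendered per timestep, which is one practical payoff of the decoupling.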

📝 Abstract
3D scene generation has garnered growing attention in recent years and has made significant progress. Generating 4D cities is more challenging than 3D scenes due to the presence of structurally complex, visually diverse objects like buildings and vehicles, and heightened human sensitivity to distortions in urban environments. To tackle these issues, we propose CityDreamer4D, a compositional generative model specifically tailored for generating unbounded 4D cities. Our main insights are 1) 4D city generation should separate dynamic objects (e.g., vehicles) from static scenes (e.g., buildings and roads), and 2) all objects in the 4D scene should be composed of different types of neural fields for buildings, vehicles, and background stuff. Specifically, we propose Traffic Scenario Generator and Unbounded Layout Generator to produce dynamic traffic scenarios and static city layouts using a highly compact BEV representation. Objects in 4D cities are generated by combining stuff-oriented and instance-oriented neural fields for background stuff, buildings, and vehicles. To suit the distinct characteristics of background stuff and instances, the neural fields employ customized generative hash grids and periodic positional embeddings as scene parameterizations. Furthermore, we offer a comprehensive suite of datasets for city generation, including OSM, Google Earth, and CityTopia. The OSM dataset provides a variety of real-world city layouts, while the Google Earth and CityTopia datasets deliver large-scale, high-quality city imagery complete with 3D instance annotations. Leveraging its compositional design, CityDreamer4D supports a range of downstream applications, such as instance editing, city stylization, and urban simulation, while delivering state-of-the-art performance in generating realistic 4D cities.
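The abstract names periodic positional embeddings as the scene parameterization for instance-oriented neural fields. As a minimal NumPy sketch, here is a standard NeRF-style sinusoidal encoding; the paper's exact frequency schedule and formulation may differ, and this function is an assumption for illustration:

```python
import numpy as np

def periodic_positional_embedding(x, num_freqs=6):
    """Map 3D points to sinusoidal features at exponentially spaced
    frequencies, so an MLP can represent high-frequency geometry.

    x: (N, 3) array of coordinates.
    Returns: (N, 3 * 2 * num_freqs) array of interleaved sin/cos features.
    """
    freqs = 2.0 ** np.arange(num_freqs)                    # 1, 2, 4, ...
    angles = x[:, :, None] * freqs[None, None, :] * np.pi  # (N, 3, F)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(x.shape[0], -1)
```

Such encodings are periodic in the input coordinates, which is why they suit repeating instance structures (e.g., building façades), whereas the background "stuff" fields use generative hash grids instead.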
Problem

Research questions and friction points this paper is trying to address.

Four-Dimensional City Generation
Complex Object Handling
Spatial Distortion Sensitivity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Neural Fields
Dynamic-Static Element Separation
4D City Generation