AI Summary
Urban Digital Twin (UDT) development has long been hindered by the absence of end-to-end benchmark datasets covering the full pipeline: from data acquisition and high-fidelity modeling to dynamic updating and downstream task validation. Existing datasets are typically limited to single modalities or isolated processing stages. To address this, we introduce the first large-scale, multimodal UDT benchmark: a 100,000 m² urban area featuring georeferenced and semantically aligned Level-of-Detail 3 (LoD3) 3D models, alongside 32 heterogeneous observation modalities (including ground-level, mobile, aerial, and satellite imagery) totaling 767 GB. This benchmark establishes the first unified indoor-outdoor georegistration, cross-modal semantic alignment, and LoD3 annotation framework. The dataset is publicly released and demonstrates significant improvements: +2.1 dB PSNR in NeRF novel-view synthesis and +8.3% IoU in building reconstruction. It further enables diverse downstream tasks, including solar potential analysis and point cloud segmentation, thereby filling a critical gap in standardized UDT evaluation.
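For readers unfamiliar with the two metrics quoted above, a minimal sketch of how PSNR (novel-view synthesis quality) and IoU (reconstruction overlap) are typically computed; the function names and toy inputs here are illustrative, not part of the TUM2TWIN tooling:

```python
import numpy as np

def psnr(ref, img, max_val=1.0):
    """Peak signal-to-noise ratio in dB between a reference and a rendered image."""
    mse = np.mean((ref - img) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

def iou(pred, gt):
    """Intersection-over-union of two boolean masks (e.g., building footprints)."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union

# Toy example: a uniform 0.1 error gives MSE = 0.01, hence PSNR = 20 dB.
ref = np.zeros((4, 4))
img = np.full((4, 4), 0.1)
print(round(psnr(ref, img), 2))  # → 20.0

# Two masks overlapping in 1 of 3 occupied cells: IoU = 1/3.
a = np.array([[1, 1], [0, 0]], dtype=bool)
b = np.array([[1, 0], [1, 0]], dtype=bool)
print(round(iou(a, b), 3))  # → 0.333
```

A "+2.1 dB PSNR" gain thus means the rendered views' mean squared error dropped by roughly 38% relative to the baseline, since PSNR is logarithmic in MSE.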
Abstract
Urban Digital Twins (UDTs) have become essential for managing cities and integrating complex, heterogeneous data from diverse sources. Creating UDTs involves challenges at multiple process stages, including acquiring accurate 3D source data, reconstructing high-fidelity 3D models, keeping models up to date, and ensuring seamless interoperability with downstream tasks. Current datasets are usually limited to one part of the processing chain, hampering comprehensive UDT validation. To address these challenges, we introduce the first comprehensive multimodal Urban Digital Twin benchmark dataset: TUM2TWIN. This dataset includes georeferenced, semantically aligned 3D models and networks along with various terrestrial, mobile, aerial, and satellite observations, comprising 32 data subsets over roughly 100,000 $m^2$ and currently 767 GB of data. By ensuring georeferenced indoor-outdoor acquisition, high accuracy, and multimodal data integration, the benchmark supports robust analysis of sensors and the development of advanced reconstruction methods. Additionally, we explore downstream tasks demonstrating the potential of TUM2TWIN, including novel view synthesis with NeRF and Gaussian Splatting, solar potential analysis, point cloud semantic segmentation, and LoD3 building reconstruction. We are convinced this contribution lays a foundation for overcoming current limitations in UDT creation, fostering new research directions and practical solutions for smarter, data-driven urban environments. The project is available under: https://tum2t.win