🤖 AI Summary
This study addresses a key limitation in mobile networking research: the scarcity of large-scale, real-world datasets, which has hindered the application of machine learning to wireless network analysis and optimization. To bridge this gap, the authors collect and integrate georeferenced 4G LTE and 5G NR measurements across the city of Vienna, combining passive observations from wideband scanners with active logs from commercial user equipment. The two sources give complementary network-side and user-side views, and all measurements are aligned in time and location. The release also includes high-resolution 3D building and terrain models. The result is a publicly released, city-scale, well-structured open dataset that provides inferred base station parameters (estimated locations, sector azimuths, and antenna heights) for a representative subset of base stations, enabling environment-aware modeling, ray-tracing calibration, and reproducible benchmarking of learning-based network optimization methods.
📝 Abstract
Machine learning for mobile network analysis, planning, and optimization is often limited by the lack of large, comprehensive real-world datasets. This paper introduces the Vienna 4G/5G Drive-Test Dataset, a city-scale open dataset of georeferenced Long Term Evolution (LTE) and 5G New Radio (NR) measurements collected across Vienna, Austria. The dataset combines passive wideband scanner observations with active handset logs, providing complementary network-side and user-side views of deployed radio access networks. The measurements cover diverse urban and suburban settings and are aligned with time and location information to support consistent evaluation. For a representative subset of base stations (BSs), we provide inferred deployment descriptors, including estimated BS locations, sector azimuths, and antenna heights. The release further includes high-resolution building and terrain models, enabling geometry-conditioned learning and calibration of deterministic approaches such as ray tracing. To facilitate practical reuse, the data are organized into scanner, handset, estimated cell information, and city-model components, and the accompanying documentation describes the available fields and intended joins between them. The dataset enables reproducible benchmarking across environment-aware learning, propagation modeling, coverage analysis, and ray-tracing calibration workflows.
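As a rough illustration of the "intended joins" between the measurement and estimated-cell components, the sketch below attaches two geometry features to each measurement: the distance to the serving base station and the angular offset from the sector azimuth. It is a minimal sketch under stated assumptions, not the dataset's actual schema: the field names (`cell_id`, `lat`, `lon`, `azimuth_deg`) and the helper functions are hypothetical, and the real column names should be taken from the accompanying documentation.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, degrees clockwise from north."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def azimuth_offset_deg(bearing, azimuth):
    """Smallest absolute angle between two bearings, in [0, 180] degrees."""
    return abs((bearing - azimuth + 180.0) % 360.0 - 180.0)


def join_features(measurements, cells):
    """Join measurements to estimated cell info on a hypothetical 'cell_id' key.

    Measurements whose cell has no estimated descriptors are skipped, since
    the abstract states that descriptors exist only for a subset of BSs.
    """
    cells_by_id = {c["cell_id"]: c for c in cells}
    rows = []
    for m in measurements:
        c = cells_by_id.get(m["cell_id"])
        if c is None:
            continue
        dist = haversine_m(c["lat"], c["lon"], m["lat"], m["lon"])
        brg = bearing_deg(c["lat"], c["lon"], m["lat"], m["lon"])
        rows.append({
            **m,
            "distance_m": dist,
            "azimuth_offset_deg": azimuth_offset_deg(brg, c["azimuth_deg"]),
        })
    return rows
```

Features of this kind could serve as simple inputs to a path-loss regressor or as a sanity check on ray-tracing output. A full geometry-conditioned model would also draw on the building and terrain models and the estimated antenna heights, which this sketch ignores.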