🤖 AI Summary
Problem: Optimizing O-RAN network slicing and resource management is held back by a lack of realistic, reproducible 5G Key Performance Indicator (KPI) data for training intelligent RAN applications (xApps).
Method: We propose a real-world-driven, end-to-end reproducible pipeline for generating 5G traffic KPIs. Multi-scenario mobile traffic is collected from live Android/iOS devices and replayed in a full-stack srsRAN-based O-RAN deployment on Colosseum, an RF emulation platform, with geospatial and mobility-aware channel modeling. The result is an open-source, O-RAN-compliant 5G KPI dataset and an integrated toolchain.
Contribution/Results: Using this dataset, we design a lightweight CNN-based xApp for traffic slice classification that achieves more than 95% accuracy offline and 92% accuracy in online inference under real-time RIC deployment constraints. Our work lowers the data barrier for O-RAN xApp development and provides a reproducible, scalable basis for training ML models in the RIC ecosystem.
📝 Abstract
5G and beyond cellular networks promise remarkable advancements in bandwidth, latency, and connectivity. The emergence of the Open Radio Access Network (O-RAN) represents a pivotal direction for the evolution of cellular networks, inherently supporting machine learning (ML) for network operation control. Within this framework, RAN Intelligent Controllers (RICs) from one provider can employ ML models developed by third-party vendors by acquiring key performance indicators (KPIs) from geographically distant base stations or user equipment (UE). Yet the development of ML models hinges on the availability of realistic and robust datasets. In this study, our contribution is two-fold. First, we collect a comprehensive 5G dataset using real-world cell phones across diverse applications, locations, and mobility scenarios. Next, we replicate this traffic within a full-stack srsRAN-based O-RAN framework on Colosseum, the world's largest radio frequency (RF) emulator. This process yields a robust, O-RAN-compliant KPI dataset that mirrors real-world conditions. We illustrate how such a dataset can be used to train ML models and deploy xApps for traffic slice classification by introducing a CNN-based classifier that achieves more than 95% accuracy offline and 92% online. To accelerate research in this domain, we provide open-source access to our toolchain and supplementary utilities, enabling the broader research community to create realistic, O-RAN-compliant datasets more quickly.