AI Summary
This study addresses train/test data leakage caused by spatial autocorrelation in cellular traffic prediction for 5G/6G network planning. To limit neighborhood leakage and improve spatial generalization, the authors propose a context-aware two-stage splitting strategy that combines context clustering on heterogeneous geographic and socioeconomic features with residual spatial error correction. The approach is evaluated on real-world crowdsourced cellular usage data from five major Canadian cities. Compared with location-only clustering, it consistently reduces mean absolute error (MAE) and makes fine-grained cellular traffic prediction more robust, supporting more reliable bandwidth provisioning and spectrum planning.
Abstract
Accurate spatial prediction of cellular traffic demand is essential for 5G NR capacity planning, network densification, and data-driven 6G planning. Although machine learning can fuse heterogeneous geospatial and socio-economic layers to estimate fine-grained demand maps, spatial autocorrelation can cause neighborhood leakage under naive train/test splits, inflating accuracy and weakening planning reliability. This paper presents an AI-driven framework that reduces leakage and improves spatial generalization via a context-aware two-stage splitting strategy with residual spatial error correction. Experiments using crowdsourced usage indicators across five major Canadian cities show consistent mean absolute error (MAE) reductions relative to location-only clustering, supporting more reliable bandwidth provisioning and evidence-based spectrum planning and sharing assessments.
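To make the idea concrete, the following is a minimal sketch of one possible reading of a context-aware two-stage split with residual spatial correction. It is not the paper's implementation: the abstract does not give the algorithmic details, so everything here is an assumption, including the use of k-means for both stages, the linear base model, the k-nearest-neighbor residual averaging, the synthetic data, and the hypothetical helper names `kmeans`, `two_stage_split`, and `knn_residual_correction`.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (assumed clustering method); returns one label per row."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every center, shape (n, k).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def two_stage_split(coords, context, n_context=3, n_blocks=4, test_frac=0.25, seed=0):
    """Stage 1: cluster on standardized context features.
    Stage 2: inside each context cluster, form spatial blocks and hold out
    whole blocks, so test points are not interleaved with training neighbors."""
    rng = np.random.default_rng(seed)
    z = (context - context.mean(axis=0)) / (context.std(axis=0) + 1e-9)
    ctx = kmeans(z, n_context, seed=seed)
    test_mask = np.zeros(len(coords), dtype=bool)
    for c in np.unique(ctx):
        idx = np.flatnonzero(ctx == c)
        if len(idx) < n_blocks:
            continue  # too few points to block; keep them all in training
        blocks = kmeans(coords[idx], n_blocks, seed=seed)
        ids = np.unique(blocks)
        if len(ids) < 2:
            continue  # degenerate blocking; keep the cluster in training
        n_test = max(1, int(round(test_frac * len(ids))))
        held_out = rng.choice(ids, size=n_test, replace=False)
        test_mask[idx[np.isin(blocks, held_out)]] = True
    return ~test_mask, test_mask

def knn_residual_correction(train_xy, train_resid, query_xy, k=5):
    """Average the residuals of the k nearest training locations (assumed corrector)."""
    d = np.linalg.norm(query_xy[:, None, :] - train_xy[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return train_resid[nearest].mean(axis=1)

# Synthetic stand-in data: NOT the crowdsourced Canadian dataset.
rng = np.random.default_rng(0)
n = 400
coords = rng.uniform(0, 10, size=(n, 2))
context = rng.normal(size=(n, 3))
# The sin term is a smooth spatial effect the base model cannot see,
# which leaves spatially autocorrelated residuals.
y = context @ np.array([1.0, -0.5, 0.3]) + np.sin(coords[:, 0]) + rng.normal(0, 0.1, n)

train, test = two_stage_split(coords, context)

# Base model: ordinary least squares on context features plus an intercept.
A = np.c_[context, np.ones(n)]
w, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
pred = A @ w

resid_train = y[train] - pred[train]
corr = knn_residual_correction(coords[train], resid_train, coords[test])

mae_base = np.mean(np.abs(y[test] - pred[test]))
mae_corr = np.mean(np.abs(y[test] - (pred[test] + corr)))
print(f"MAE base: {mae_base:.3f}  MAE with residual correction: {mae_corr:.3f}")
```

Holding out whole spatial blocks, rather than random points, is what keeps near-duplicate neighbors from appearing on both sides of the split. Whether the correction step helps on held-out blocks depends on how far the residual structure extends relative to the block size, so the printed MAE values should be read as an illustration on toy data rather than as evidence for the paper's results.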