🤖 AI Summary
This study addresses the limitations of traditional outdoor-to-indoor (O2I) and indoor-to-indoor (I2I) radio frequency (RF) building loss modeling, which relies on costly field measurements and must cope with noisy, imbalanced real-world data. The work proposes a semi-supervised learning (SSL) approach that integrates passively collected, crowdsourced user equipment (UE) data from 3GPP-compliant networks with publicly available building information. Random Forest, XGBoost, LightGBM, and a voting classifier are evaluated under both supervised (SL) and semi-supervised settings. Under identical data constraints, the proposed method achieves a relative accuracy gain of up to 12.6% for O2I loss classification and 3.4% for I2I loss compared with SL-only inference, while reducing prediction entropy by up to 8.4%. Among the variants, SSL XGBoost yields the most confident O2I classification, whereas SSL LightGBM performs best for I2I loss.
📝 Abstract
Accurate modeling of outdoor-to-indoor (O2I) and indoor-to-indoor (I2I) signal loss is important for improving indoor wireless network performance in dense urban areas. Traditional on-site measurements are expensive, time-consuming, and difficult to conduct across wide regions. Real-world datasets also tend to be noisy and imbalanced, which makes signal loss prediction challenging. This study presents a machine learning framework for classifying radio frequency (RF) building loss. The framework combines passively collected, crowdsourced user equipment (UE) data from 3GPP-compliant networks with public building information. We evaluated Random Forest, XGBoost, LightGBM, and a voting classifier using both supervised (SL) and semi-supervised learning (SSL). Compared to SL-only inference, the proposed SL and SSL framework improved both prediction accuracy and confidence under identical data constraints, achieving up to 12.6% relative accuracy gain for O2I loss and 3.4% for I2I loss, while reducing prediction entropy by up to 8.4%. Among the evaluated models, SSL XGBoost provided the most confident O2I loss classification, whereas SSL LightGBM achieved the best performance for I2I loss. These results demonstrate that the proposed approach provides a practical, data-driven alternative to traditional models, with promising potential to support better network planning and indoor coverage optimization.
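The abstract reports results as relative accuracy gain and as a reduction in prediction entropy, but it does not spell out how either is computed. The sketch below shows one plausible reading, assuming prediction entropy means the Shannon entropy of each predicted class-probability vector averaged over samples. It also includes a generic confidence-thresholded pseudo-labeling step of the kind often used in self-training. That step is an illustrative assumption and not necessarily the SSL procedure used in the paper. All function names, the threshold, and the numbers in the usage section are hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of one predicted class-probability vector.

    Assumption: 'prediction entropy' is read here as Shannon entropy;
    the abstract does not give the exact definition.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mean_prediction_entropy(prob_rows):
    """Average entropy over all predictions; lower means more confident."""
    return sum(shannon_entropy(row) for row in prob_rows) / len(prob_rows)


def relative_change(baseline, new):
    """Relative change of `new` with respect to `baseline` (0.1 means +10%)."""
    return (new - baseline) / baseline


def select_pseudo_labels(prob_rows, threshold=0.9):
    """Return (sample_index, predicted_label) pairs whose top class
    probability reaches `threshold`.

    Generic self-training step (assumed, not taken from the paper):
    confident predictions on unlabeled samples are kept as pseudo-labels.
    """
    selected = []
    for index, row in enumerate(prob_rows):
        label = max(range(len(row)), key=row.__getitem__)
        if row[label] >= threshold:
            selected.append((index, label))
    return selected


if __name__ == "__main__":
    # Hypothetical class probabilities for three unlabeled samples.
    unlabeled_probs = [[0.95, 0.05], [0.60, 0.40], [0.10, 0.90]]
    print(select_pseudo_labels(unlabeled_probs, threshold=0.9))
    print(mean_prediction_entropy(unlabeled_probs))
    # Hypothetical accuracies, chosen only to illustrate the formula.
    print(relative_change(baseline=0.50, new=0.55))
```

Under this reading, a relative accuracy gain is measured against the SL baseline's accuracy rather than in absolute percentage points, and a drop in mean entropy indicates that the model's predicted class probabilities have become more concentrated.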