🤖 AI Summary
This work addresses two pervasive challenges in multi-station WiFi channel state information (CSI) sensing: station-wise feature missingness and scarce labeled data. To tackle structured missingness and label scarcity jointly, the authors explicitly incorporate station unavailability into both representation learning and downstream task training. The method adapts cross-modal self-supervised learning (CroSSL) to multi-station CSI so that representations learned from unlabeled data are invariant to station-wise missingness, and it introduces Station-wise Masking Augmentation (SMA) during downstream training to expose the model to realistic unavailability patterns. Experiments show that neither component alone is sufficient: only their combination achieves robust performance when feature missingness and label scarcity coexist, which improves the robustness and practicality of CSI-based sensing systems.
📝 Abstract
We propose a WiFi Channel State Information (CSI) sensing framework for multi-station deployments that addresses two fundamental challenges in practical CSI sensing: station-wise feature missingness and limited labeled data. Feature missingness is commonly handled by resampling unevenly spaced CSI measurements or by reconstructing missing samples, while label scarcity is mitigated by data augmentation or self-supervised representation learning. However, these techniques are typically developed in isolation and do not jointly address long-term, structured station unavailability together with label scarcity. To bridge this gap, we explicitly incorporate station unavailability into both representation learning and downstream model training. Specifically, we adapt cross-modal self-supervised learning (CroSSL), a representation learning framework originally designed for time-series sensory data, to multi-station CSI sensing in order to learn representations that are inherently invariant to station-wise feature missingness from unlabeled data. Furthermore, we introduce Station-wise Masking Augmentation (SMA) during downstream model training, which exposes the model to realistic station unavailability patterns under limited labeled data. Our experiments show that neither missingness-invariant pre-training nor station-wise augmentation alone is sufficient; their combination is essential to achieve robust performance under both station-wise feature missingness and label scarcity. The proposed framework provides a practical and robust foundation for multi-station WiFi CSI sensing in real-world deployments.
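To make the idea of station-wise masking concrete, the following is a minimal sketch of what such an augmentation could look like. It is not the authors' implementation: the function name `station_wise_masking`, the tensor layout `(batch, stations, time, features)`, the independent per-station masking probability `p_mask`, the zero fill value, and the rule that at least one station always remains available are all assumptions made for illustration. The abstract does not specify how unavailability patterns are sampled.

```python
import numpy as np


def station_wise_masking(csi, rng, p_mask=0.3, fill_value=0.0):
    """Mask whole stations in a batch of CSI features (illustrative sketch).

    csi: array of shape (batch, stations, time, features); layout is assumed.
    rng: a numpy.random.Generator.
    p_mask: assumed probability of dropping each station independently.
    Returns the masked copy and a boolean availability mask (True = kept).
    """
    batch, stations = csi.shape[:2]

    # Sample which stations stay available for each sample.
    available = rng.random((batch, stations)) >= p_mask

    # Assumption: never drop every station, so each sample keeps some signal.
    rows = np.flatnonzero(~available.any(axis=1))
    keep = rng.integers(0, stations, size=batch)
    available[rows, keep[rows]] = True

    # Broadcast the station mask over the time and feature axes.
    masked = np.where(available[:, :, None, None], csi, fill_value)
    return masked, available


# Example: 8 samples, 4 stations, 16 time steps, 3 features per step.
rng = np.random.default_rng(0)
batch = np.ones((8, 4, 16, 3))
masked_batch, availability = station_wise_masking(batch, rng, p_mask=0.5)
```

In a setup like this, the augmentation would be applied to each labeled mini-batch during downstream training, and the availability mask could optionally be passed to the model as an extra input. Whether the proposed framework does either of these is not stated in the abstract.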