CSI-JEPA: Towards Foundation Representations for Ubiquitous Sensing with Minimal Supervision

📅 2026-05-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of deploying channel state information (CSI)-based sensing models in label-scarce settings, where existing supervised approaches need substantial annotated data for each task, device, user, or environment. The authors propose CSI-JEPA, a self-supervised predictive representation learning framework that learns from unlabeled CSI by predicting latent features of masked channel regions from visible context. It tokenizes channel-response amplitude windows along the time and subcarrier dimensions to match CSI's physical structure, and uses a channel variation-aware masking strategy that draws prediction targets from regions with stronger local variation. The resulting temporal-spectral representations are reused through a frozen backbone paired with lightweight task-specific adapters. Evaluated on seven real-world Wi-Fi sensing tasks, the method achieves up to 10.64 percentage points of mean accuracy gain over a state-of-the-art supervised Transformer and matched-budget label savings of up to 98.0%.
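The frozen-backbone-plus-adapter pattern mentioned in the summary can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: `FrozenEncoder` is a fixed random projection standing in for a pretrained encoder, `LinearAdapter` is a plain softmax-regression head, and the data are synthetic. None of it is the authors' architecture or training recipe.

```python
import numpy as np


class FrozenEncoder:
    """Hypothetical stand-in for a pretrained backbone: fixed projection + tanh.

    Its weights are set once and never updated afterwards.
    """

    def __init__(self, in_dim, feat_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=in_dim ** -0.5, size=(in_dim, feat_dim))

    def __call__(self, x):
        return np.tanh(x @ self.W)


class LinearAdapter:
    """Hypothetical lightweight task head: softmax regression on frozen features."""

    def __init__(self, feat_dim, n_classes):
        self.W = np.zeros((feat_dim, n_classes))
        self.b = np.zeros(n_classes)

    def logits(self, z):
        return z @ self.W + self.b

    def fit(self, z, y, lr=0.1, steps=500):
        onehot = np.eye(self.b.size)[y]
        for _ in range(steps):
            s = self.logits(z)
            s -= s.max(axis=1, keepdims=True)   # numerical stability
            p = np.exp(s)
            p /= p.sum(axis=1, keepdims=True)
            g = (p - onehot) / len(y)           # gradient of mean cross-entropy
            self.W -= lr * (z.T @ g)            # only adapter weights change
            self.b -= lr * g.sum(axis=0)
        return self

    def predict(self, z):
        return self.logits(z).argmax(axis=1)


# Synthetic two-class data (illustrative only, not CSI measurements).
rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=200)
x = rng.normal(size=(200, 16)) + 2.0 * y[:, None]

encoder = FrozenEncoder(in_dim=16, feat_dim=32)
w_before = encoder.W.copy()
adapter = LinearAdapter(feat_dim=32, n_classes=2).fit(encoder(x), y)
acc = (adapter.predict(encoder(x)) == y).mean()
```

Because only the adapter's parameters are trained, each new task adds a small head rather than a new full model, which is the property the summary attributes to the frozen-backbone design.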

📝 Abstract

Channel state information (CSI) provides a widely available sensing modality for human and environment perception, but existing CSI sensing models usually rely on task-specific supervised training and require substantial labeled data for each task, device, user, or environment. This limits their scalability in practical deployments where unlabeled CSI is abundant but labeled data is costly to collect. In this paper, we present CSI-JEPA, a self-supervised predictive representation learning framework for label-efficient, multi-task Wi-Fi sensing. CSI-JEPA learns reusable temporal-spectral representations from unlabeled CSI samples by predicting latent features of masked channel regions from visible context. To better match the physical structure of CSI, CSI-JEPA tokenizes channel-response amplitude windows along the time and subcarrier dimensions. It then introduces a channel variation-aware masking strategy that samples predictive targets from regions with stronger local temporal and subcarrier-domain variations. After pretraining, the encoder is frozen and used as a backbone, with lightweight task-specific adapters added for downstream sensing tasks. We evaluate CSI-JEPA on seven real-world Wi-Fi sensing tasks spanning diverse objectives and deployment settings. The results show that CSI-JEPA improves downstream sensing performance over competitive baselines, achieving up to 10.64 percentage points of mean accuracy gain over a state-of-the-art supervised Transformer and matched-budget label savings of up to 98.0%.
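The tokenization and masking steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the patch sizes, the variation score (mean absolute first difference along time plus along subcarriers), the proportional sampling rule, and the function names `tokenize`, `variation_scores`, and `sample_target_mask` are all hypothetical choices.

```python
import numpy as np


def _crop(csi_amp, patch_t, patch_f):
    """Crop a (time, subcarrier) window to a whole number of patches."""
    n_t, n_f = csi_amp.shape[0] // patch_t, csi_amp.shape[1] // patch_f
    return csi_amp[: n_t * patch_t, : n_f * patch_f], n_t, n_f


def tokenize(csi_amp, patch_t, patch_f):
    """Split the window into non-overlapping time x subcarrier patches.

    Returns an array of shape (n_t, n_f, patch_t * patch_f).
    """
    x, n_t, n_f = _crop(csi_amp, patch_t, patch_f)
    x = x.reshape(n_t, patch_t, n_f, patch_f).transpose(0, 2, 1, 3)
    return x.reshape(n_t, n_f, patch_t * patch_f)


def variation_scores(csi_amp, patch_t, patch_f):
    """Assumed score: per-patch mean of |d/dt| + |d/df| of the amplitude."""
    x, n_t, n_f = _crop(csi_amp, patch_t, patch_f)
    d_t = np.abs(np.diff(x, axis=0, prepend=x[:1]))      # temporal variation
    d_f = np.abs(np.diff(x, axis=1, prepend=x[:, :1]))   # subcarrier variation
    local = d_t + d_f
    return local.reshape(n_t, patch_t, n_f, patch_f).mean(axis=(1, 3))


def sample_target_mask(scores, mask_ratio, rng, eps=1e-8):
    """Sample target patches without replacement, favoring high-variation ones."""
    flat = scores.ravel() + eps          # eps keeps every patch selectable
    k = max(1, int(round(mask_ratio * flat.size)))
    idx = rng.choice(flat.size, size=k, replace=False, p=flat / flat.sum())
    mask = np.zeros(flat.size, dtype=bool)
    mask[idx] = True
    return mask.reshape(scores.shape)


# Synthetic amplitude window with one high-variation region (illustrative only).
rng = np.random.default_rng(0)
amp = np.abs(rng.normal(scale=0.01, size=(64, 32)))
amp[16:32, 8:16] += np.abs(rng.normal(size=(16, 8)))

tokens = tokenize(amp, patch_t=8, patch_f=4)
scores = variation_scores(amp, patch_t=8, patch_f=4)
target_mask = sample_target_mask(scores, mask_ratio=0.25, rng=rng)
```

In a JEPA-style setup, patches where `target_mask` is `True` would serve as prediction targets whose latent features are predicted from the remaining visible patches. Sampling in proportion to the score, rather than taking the top-k patches, keeps some randomness in which regions are masked; that choice is an assumption of this sketch.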

Problem

Research questions and friction points this paper is trying to address.

CSI sensing

supervised learning

label efficiency

ubiquitous sensing

scalability

Innovation

Methods, ideas, or system contributions that make the work stand out.

self-supervised learning

CSI representation

predictive modeling