Wi-CBR: WiFi-based Cross-domain Behavior Recognition via Multimodal Collaborative Awareness

2025-06-13
Citations: 0
Influential: 0
AI Summary
Existing WiFi-based human activity recognition methods predominantly rely on single-modal signals, which limits their capacity to model complex channel dynamics and leads to poor cross-domain generalization. To address this, we propose a novel dual-modal collaborative sensing framework that, for the first time, jointly leverages WiFi phase and Doppler shift signals to construct a human-activity-driven dynamic channel representation. Methodologically, we design a dual-branch self-attention architecture to capture intra-modal temporal dependencies and introduce a grouped gating attention mechanism to enable robust cross-modal feature fusion and information entropy optimization. Evaluated on the Widar3.0 and XRF55 datasets, our approach achieves a 4.2% improvement in intra-domain accuracy over state-of-the-art methods and a 7.8% gain in cross-domain accuracy, significantly enhancing both the precision and the generalizability of contactless activity recognition.

๐Ÿ“ Abstract
WiFi-based human behavior recognition aims to recognize gestures and activities by analyzing wireless signal variations. However, existing methods typically focus on a single type of data, neglecting the interaction and fusion of multiple features. To this end, we propose a novel multimodal collaborative awareness method. By leveraging phase data, which reflects changes in dynamic path length, and Doppler frequency shift (DFS) data, which corresponds to frequency changes related to the speed of gesture movement, we enable efficient interaction and fusion of these features to improve recognition accuracy. Specifically, we first introduce a dual-branch self-attention module to capture spatial-temporal cues within each modality. Then, a group attention mechanism is applied to the concatenated phase and DFS features to mine key group features critical for behavior recognition. Through a gating mechanism, the combined features are further divided into PD-strengthen and PD-weaken branches, optimizing information entropy and promoting cross-modal collaborative awareness. Extensive in-domain and cross-domain experiments on two large publicly available datasets, Widar3.0 and XRF55, demonstrate the superior performance of our method.
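The physical link between the two modalities can be made explicit. As a sketch under the standard narrowband reflection model (the symbols \lambda, d(t), \phi(t) and f_D(t) are notation introduced here for illustration, not taken from the paper), a dynamic path of length d(t) at carrier wavelength \lambda contributes the following phase and Doppler frequency shift:

```latex
\phi(t) = -\frac{2\pi\, d(t)}{\lambda} \pmod{2\pi},
\qquad
f_D(t) = \frac{1}{2\pi}\,\frac{\mathrm{d}\phi(t)}{\mathrm{d}t}
       = -\frac{1}{\lambda}\,\frac{\mathrm{d}\,d(t)}{\mathrm{d}t}.
```

Under this model, phase tracks the path length itself while DFS tracks its rate of change, which is why the two signals carry complementary information. For example, at a carrier near 5.8 GHz (\lambda of roughly 5.2 cm), a path length that changes at 1 m/s produces a Doppler shift of magnitude roughly 19 Hz.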
Problem

Research questions and friction points this paper is trying to address.

Improves WiFi-based gesture and activity recognition accuracy
Addresses neglect of multi-feature interaction in existing methods
Enables cross-modal collaboration via phase and DFS data fusion
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal fusion of phase and DFS data
Dual-branch self-attention for spatial-temporal cues
Group attention with gating mechanism optimization
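The three ideas above can be sketched in a few lines of dependency-free Python. This is a toy illustration only, not the authors' implementation: the layer definitions are not given in this summary, so the identity projections in `self_attention`, the sigmoid-of-group-mean gate, and the complementary `1 - gate` weighting for the weaken branch are all assumptions, and the function names are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(seq):
    """Scaled dot-product self-attention over a [time][dim] sequence.

    Assumption: identity projections (Q = K = V = seq) to keep the sketch short.
    """
    dim = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, seq)) for j in range(dim)])
    return out

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def grouped_gated_fusion(phase_seq, dfs_seq, num_groups=2):
    """Fuse phase and DFS features, then split them into two gated branches."""
    # Dual-branch step: each modality attends over its own time axis.
    phase_ctx = self_attention(phase_seq)
    dfs_ctx = self_attention(dfs_seq)
    # Concatenate the two modalities along the feature axis at each time step.
    fused = [p + d for p, d in zip(phase_ctx, dfs_ctx)]
    dim = len(fused[0])
    assert dim % num_groups == 0, "feature dimension must divide evenly into groups"
    group_size = dim // num_groups
    strengthen, weaken = [], []
    for step in fused:
        s_row, w_row = [], []
        for g in range(num_groups):
            group = step[g * group_size:(g + 1) * group_size]
            # Hypothetical gate: one scalar per group, from the group's mean activation.
            gate = sigmoid(sum(group) / group_size)
            s_row.extend(gate * v for v in group)          # strengthen branch
            w_row.extend((1.0 - gate) * v for v in group)  # weaken branch
        strengthen.append(s_row)
        weaken.append(w_row)
    return strengthen, weaken

if __name__ == "__main__":
    phase = [[0.1, 0.2], [0.0, -0.3], [0.5, 0.4]]  # toy phase features, 3 time steps
    dfs = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]     # toy DFS features, 3 time steps
    s, w = grouped_gated_fusion(phase, dfs, num_groups=2)
    print(len(s), len(s[0]))
```

With these choices the two branches sum back to the fused features, so the gate only redistributes information between them. A trained model would instead learn the projections and gate parameters.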
Ruobei Zhang
Hefei University of Technology
Shengeng Tang
Hefei University of Technology
Huan Yan
Tsinghua University
Spatio-temporal data mining, recommender system
Xiang Zhang
Hefei University of Technology
Richang Hong
Hefei University of Technology
Multimedia, Pattern Recognition