WiFi-based Cross-Domain Gesture Recognition Using Attention Mechanism

📅 2025-12-04
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing WiFi-based gesture recognition methods exhibit poor cross-domain generalization, suffering significant performance degradation in unseen environments. To address this, we propose a Doppler-spectrum-based cross-domain gesture recognition method: first, multi-angle Doppler time-frequency spectrograms are extracted from channel state information (CSI) and structured as temporal images; second, a ResNet18 backbone is enhanced with a multi-semantic spatial attention module and an adaptive channel attention module—both inspired by CBAM—to jointly model domain-invariant spatiotemporal features. Our approach effectively decouples environment-specific interference from intrinsic gesture dynamics, thereby substantially improving out-of-domain robustness. Evaluated on the Widar3 dataset, the method achieves 99.72% intra-domain accuracy and 97.61% cross-domain recognition accuracy, significantly surpassing current state-of-the-art approaches.

📝 Abstract
While fulfilling communication tasks, wireless signals can also be used to sense the environment. Among various types of sensing media, WiFi signals offer advantages such as widespread availability, low hardware cost, and strong robustness to environmental conditions such as light, temperature, and humidity. By analyzing WiFi signals in the environment, it is possible to capture dynamic changes of the human body and accomplish sensing applications such as gesture recognition. However, many existing gesture sensing solutions perform well in-domain but lack cross-domain capability (i.e., recognition performance in untrained environments). To address this, we extract Doppler spectra from the channel state information (CSI) received by all receivers and concatenate the Doppler spectra along a shared time axis to generate fused images with multi-angle information as input features. Furthermore, inspired by the convolutional block attention module (CBAM), we propose a gesture recognition network that integrates a multi-semantic spatial attention mechanism with a self-attention-based channel attention mechanism. This network constructs attention maps that quantify the spatiotemporal features of gestures in the images, enabling the extraction of key domain-independent features. Additionally, ResNet18 is employed as the backbone network to further capture deep-level features. To validate the network's performance, we evaluate it on the public Widar3 dataset; the results show that it not only maintains a high in-domain accuracy of 99.72% but also achieves a high cross-domain recognition accuracy of 97.61%, significantly outperforming existing best solutions.
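The CBAM-inspired attention design described in the abstract can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the shapes are hypothetical, the channel branch uses a shared two-layer MLP as in CBAM, and a fixed weighted sum stands in for the learned convolution of the spatial branch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """CBAM-style channel attention on a (C, H, W) feature map.
    Average- and max-pooled descriptors pass through a shared
    two-layer MLP (w1, w2) and are summed before the sigmoid."""
    avg = feat.mean(axis=(1, 2))                    # (C,)
    mx = feat.max(axis=(1, 2))                      # (C,)
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)    # shared MLP with ReLU
    weights = sigmoid(mlp(avg) + mlp(mx))           # (C,) scaling factors
    return feat * weights[:, None, None]

def spatial_attention(feat, alpha=0.5, beta=0.5):
    """CBAM-style spatial attention: channel-wise mean and max maps,
    combined here by fixed weights as a stand-in for the learned conv."""
    avg = feat.mean(axis=0)                         # (H, W)
    mx = feat.max(axis=0)                           # (H, W)
    attn = sigmoid(alpha * avg + beta * mx)         # (H, W) attention map
    return feat * attn[None, :, :]

# Hypothetical sizes: 8 channels, 16x16 map, MLP bottleneck of 4.
rng = np.random.default_rng(0)
C, H, W, R = 8, 16, 16, 4
feat = rng.standard_normal((C, H, W))
w1, w2 = rng.standard_normal((R, C)), rng.standard_normal((C, R))
out = spatial_attention(channel_attention(feat, w1, w2))
print(out.shape)  # (8, 16, 16)
```

In the paper's network, modules like these are interleaved with ResNet18 stages so that attention reweights features before deeper layers; both attention maps only rescale activations, so the feature-map shape is unchanged.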
Problem

Research questions and friction points this paper is trying to address.

Addresses cross-domain gesture recognition limitations in WiFi sensing.
Proposes attention mechanism network for domain-independent feature extraction.
Enhances accuracy in untrained environments using multi-angle Doppler spectra.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extracts Doppler spectra from CSI to create fused images
Integrates spatial and channel attention mechanisms for feature extraction
Uses ResNet18 backbone to capture deep-level gesture features
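The fused-input idea in the first bullet can be sketched as follows: each receiver's CSI time series yields a Doppler time-frequency image (here, a plain windowed-FFT magnitude), and the per-receiver images share one time axis and are stacked along the frequency axis into a single multi-angle image. Window, hop, and array sizes below are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def doppler_spectrogram(csi, win=64, hop=16):
    """Magnitude STFT of one receiver's complex CSI time series.
    Returns a (win, n_frames) time-frequency image, zero Doppler centered."""
    n_frames = 1 + (len(csi) - win) // hop
    frames = np.stack([csi[i * hop : i * hop + win] for i in range(n_frames)])
    spec = np.abs(np.fft.fft(frames * np.hanning(win), axis=1))
    return np.fft.fftshift(spec, axes=1).T

# Hypothetical setup: 3 receivers, 1024 complex CSI samples each.
rng = np.random.default_rng(1)
n_rx, T = 3, 1024
csi = rng.standard_normal((n_rx, T)) + 1j * rng.standard_normal((n_rx, T))

specs = [doppler_spectrogram(x) for x in csi]
fused = np.concatenate(specs, axis=0)  # receivers stacked, shared time axis
print(fused.shape)
```

The fused image then serves as input to the attention-augmented ResNet18, so a single forward pass sees all receivers' viewing angles at once.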
Ruijing Liu
National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China
Cunhua Pan
Professor, Southeast University
Jiaming Zeng
National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China
Hong Ren
Southeast University
Kezhi Wang
Professor, Royal Society Industry Fellow, Brunel University London
Lei Kong
New H3C Technologies Co., Ltd & Zhejiang University
Jiangzhou Wang
Professor, University of Kent