WiFlow: A Lightweight WiFi-based Continuous Human Pose Estimation Network with Spatio-Temporal Feature Decoupling

📅 2026-02-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses two challenges in WiFi-based pose estimation, modeling continuous human motion and keeping computational cost low, by proposing WiFlow, a lightweight framework. WiFlow introduces, for the first time in WiFi-based pose estimation, a spatio-temporal feature decoupling strategy combined with axial attention. Its encoder-decoder architecture uses temporal convolutions and asymmetric convolutions to extract spatio-temporal features from channel state information (CSI) signals, and axial attention to capture structural dependencies among keypoints. Evaluated on a self-collected dataset of 360,000 samples, WiFlow achieves a PCK@20 of 97.00% and a PCK@50 of 99.48% with only 4.82 million parameters, along with a mean per-joint position error of 0.008 m.
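To make the idea of decoupling temporal and spatial feature extraction concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not the WiFlow implementation: the CSI layout (`csi[subcarrier][time]`), the function names, the kernel sizes, and the kernel weights are all hypothetical, and the summary does not specify the shapes the authors use. The sketch applies one 1 x k kernel along the time axis and one k x 1 kernel across subcarriers, rather than a single square kernel that mixes both axes as an image model would.

```python
def conv_along_time(csi, kernel):
    """Slide a 1 x k kernel along time within each subcarrier row ('valid' padding)."""
    k = len(kernel)
    return [
        [sum(row[t + i] * kernel[i] for i in range(k))
         for t in range(len(row) - k + 1)]
        for row in csi
    ]


def conv_along_subcarriers(csi, kernel):
    """Slide a k x 1 kernel across subcarriers at each time step ('valid' padding)."""
    k = len(kernel)
    n_sub, n_time = len(csi), len(csi[0])
    return [
        [sum(csi[s + i][t] * kernel[i] for i in range(k))
         for t in range(n_time)]
        for s in range(n_sub - k + 1)
    ]


# Toy CSI amplitudes: 3 subcarriers x 4 time steps (made-up numbers).
csi = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [0.0, 1.0, 0.0, 1.0],
]

# Temporal kernel: difference across two time steps, per subcarrier.
temporal = conv_along_time(csi, [1.0, 0.0, -1.0])

# Spatial (asymmetric) kernel: average of neighboring subcarriers, per time step.
spatial = conv_along_subcarriers(csi, [0.5, 0.5])
```

In general, a pair of 1 x k and k x 1 kernels carries 2k weights where a square k x k kernel carries k², which is one reason one-dimensional kernels suit lightweight models.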

📝 Abstract

Human pose estimation is fundamental to intelligent perception in the Internet of Things (IoT), enabling applications ranging from smart healthcare to human-computer interaction. While WiFi-based methods have gained traction, they often struggle with continuous motion and high computational overhead. This work presents WiFlow, a novel framework for continuous human pose estimation using WiFi signals. Unlike vision-based approaches such as two-dimensional deep residual networks that treat Channel State Information (CSI) as images, WiFlow employs an encoder-decoder architecture. The encoder captures spatio-temporal features of CSI using temporal and asymmetric convolutions, preserving the original sequential structure of the signals. It then refines the keypoint features of the human bodies being tracked and captures their structural dependencies via axial attention. The decoder subsequently maps the encoded high-dimensional features into keypoint coordinates. Trained on a self-collected dataset of 360,000 synchronized CSI-pose samples from 5 subjects performing continuous sequences of 8 daily activities, WiFlow achieves a Percentage of Correct Keypoints (PCK) of 97.00% at a threshold of 20% (PCK@20) and 99.48% at a threshold of 50% (PCK@50), with a mean per-joint position error of 0.008 m. With only 4.82M parameters, WiFlow significantly reduces model complexity and computational cost, establishing a new performance baseline for practical WiFi-based human pose estimation. Our code and datasets are available at https://github.com/DY2434/WiFlow-WiFi-Pose-Estimation-with-Spatio-Temporal-Decoupling.git.
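The two reported metrics can be sketched as follows. This is a minimal illustration, assuming a common definition of PCK in which a keypoint counts as correct when its Euclidean error is at most a fraction `alpha` of a reference length. The abstract does not state which reference length the authors normalize by, so `ref_length` and the toy coordinates are hypothetical.

```python
import math


def pck(pred, truth, ref_length, alpha):
    """Fraction of keypoints whose error is at most alpha * ref_length."""
    hits = sum(
        1 for p, t in zip(pred, truth)
        if math.dist(p, t) <= alpha * ref_length
    )
    return hits / len(truth)


def mpjpe(pred, truth):
    """Mean per-joint position error: average Euclidean distance over keypoints."""
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(truth)


# Toy 2D pose with four keypoints (made-up numbers, arbitrary units).
truth = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
pred = [(0.0, 0.1), (0.0, 1.3), (1.0, 1.0), (1.6, 0.0)]
ref_length = 1.0  # assumed normalization length, e.g. a torso or bounding-box size

pck20 = pck(pred, truth, ref_length, 0.20)  # keypoints within 20% of ref_length
pck50 = pck(pred, truth, ref_length, 0.50)  # keypoints within 50% of ref_length
err = mpjpe(pred, truth)
```

Because the threshold for PCK@50 is looser than for PCK@20, PCK@50 can never be lower than PCK@20 on the same predictions, which matches the ordering of the reported 99.48% and 97.00%.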

Problem

Research questions and friction points this paper is trying to address.

WiFi-based human pose estimation

continuous motion

computational overhead

spatio-temporal features

Channel State Information

Innovation

Methods, ideas, or system contributions that make the work stand out.

WiFi-based pose estimation

spatio-temporal feature decoupling

axial attention
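As a rough illustration of axial attention, the sketch below applies self-attention along one axis of a feature grid at a time instead of over all positions jointly. It is a sketch under stated assumptions rather than the authors' module: the grid layout `x[t][k]` (time step, then keypoint), the choice of those two axes, and the absence of learned query, key, and value projections are simplifications made here for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Scaled dot-product self-attention over a list of feature vectors.

    Queries, keys, and values are the inputs themselves (no learned weights).
    """
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, seq)) for c in range(d)])
    return out


def axial_attention(x):
    """Attend along the keypoint axis, then along the time axis.

    x[t][k] is the feature vector of keypoint k at time step t.
    """
    n_time, n_kp = len(x), len(x[0])
    # Pass 1: within each time step, every keypoint attends to all keypoints.
    x = [self_attention(row) for row in x]
    # Pass 2: for each keypoint, every time step attends to all time steps.
    cols = [self_attention([x[t][k] for t in range(n_time)]) for k in range(n_kp)]
    return [[cols[k][t] for k in range(n_kp)] for t in range(n_time)]


# Toy grid: 2 time steps x 3 keypoints x 2 feature channels (made-up numbers).
x = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [0.0, 0.0], [2.0, 1.0]],
]
y = axial_attention(x)  # same shape as x
```

For a grid of T time steps and K keypoints, the two axial passes compare on the order of T·K·(T + K) pairs of positions, whereas full attention over all T·K positions compares (T·K)² pairs. That reduction is the usual motivation for choosing axial attention in a lightweight model.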