🤖 AI Summary
Problem: Mapping Wi-Fi channel state information (CSI) to high-fidelity depth images remains challenging because the inverse problem is ill-posed and under-constrained, and because existing mappings lack explicit physical interpretability. Method: This paper proposes a physics-driven disentangled reconstruction framework: (i) it decomposes depth images into three physically grounded components (shape, depth, and spatial position) and establishes explicit analytical mappings between each component and physical features derived from raw CSI (angle-of-arrival, time-of-flight, and Doppler frequency shift); (ii) it introduces a VAE-based teacher–student architecture integrating time-frequency feature extraction, multi-task auxiliary learning, and physics-informed regularization. Contribution/Results: Evaluated on real-world deployments, the method achieves centimeter-level depth accuracy and superior structural fidelity, improving PSNR by 4.2 dB over state-of-the-art methods. It enables robust non-line-of-sight (NLoS), low-power, contactless, real-time human sensing, making it suitable for security monitoring and elderly care applications.
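The summary does not spell out the training objective. As a rough illustration only, a combined objective for a VAE with auxiliary tasks and a teacher–student term might look like the sketch below. The function names, the use of mean squared error for every term, and all weights (`beta`, `aux_weight`, `distill_weight`) are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def kl_diag_gaussian(mu, logvar):
    """KL divergence from N(mu, exp(logvar)) to N(0, I), summed over latent dims."""
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)


def multitask_vae_loss(pred, target, mu, logvar, aux_pred, aux_target,
                       teacher_mu=None, beta=1.0, aux_weight=0.1,
                       distill_weight=1.0):
    """Hypothetical combined objective; every weight here is illustrative."""
    # Depth-image reconstruction term (MSE is an assumption).
    recon = np.mean((pred - target) ** 2)
    # Standard VAE latent regulariser.
    kl = kl_diag_gaussian(mu, logvar)
    # One auxiliary term per component, e.g. "shape", "depth", "position".
    aux = sum(np.mean((aux_pred[k] - aux_target[k]) ** 2) for k in aux_target)
    loss = recon + beta * kl + aux_weight * aux
    if teacher_mu is not None:
        # Teacher-student term: pull the student's latent mean toward the teacher's.
        loss += distill_weight * np.mean((mu - teacher_mu) ** 2)
    return float(loss)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((8, 8))
    aux_t = {"shape": rng.random((8, 8)), "depth": rng.random(1), "position": rng.random(2)}
    aux_p = {k: v + 0.05 for k, v in aux_t.items()}
    print(multitask_vae_loss(img + 0.1, img, np.zeros(4), np.zeros(4), aux_p, aux_t,
                             teacher_mu=np.full(4, 0.2)))
```

Keeping each component as a separate auxiliary term makes it easy to weight or ablate shape, depth, and position individually; how the actual system balances these terms is not stated in the text above.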
📝 Abstract
This study proposes a new deep learning method for reconstructing depth images of moving objects within a specific area from Wi-Fi channel state information (CSI). Wi-Fi-based depth imaging has novel applications in domains such as security and elder care. However, reconstructing depth images from CSI is challenging because both CSI and depth images are high-dimensional, which makes the mapping between them particularly difficult to learn. To address this challenge, we propose a new approach called Wi-Depth. The main idea behind Wi-Depth is that a depth image of a moving object can be decomposed into three core components: the shape, depth, and position of the target. In the depth-image reconstruction task, Wi-Depth therefore estimates these three components simultaneously as auxiliary tasks within our proposed VAE-based teacher-student architecture, enabling it to output images that are consistent in shape, depth, and position. In addition, this decomposition exploits the fact that shape, depth, and position relate to primitive information that can be inferred from CSI, such as angle-of-arrival, time-of-flight, and Doppler frequency shift.
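To make the link between these primitives and geometry concrete, the sketch below applies the standard textbook relations: path length from time-of-flight, rate of change of path length from Doppler shift, and arrival angle from the phase difference between two antennas. The function names, the two-element far-field array model, and the 5 GHz carrier in the usage lines are assumptions chosen for illustration; the abstract does not state how Wi-Depth computes or uses these quantities.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def path_length_from_tof(tof_s):
    """Propagation path length (m) for a time-of-flight given in seconds."""
    return C * tof_s


def path_rate_from_doppler(doppler_hz, carrier_hz):
    """Rate of change of path length (m/s); a positive shift means a shortening path."""
    wavelength = C / carrier_hz
    return -doppler_hz * wavelength


def aoa_from_phase(delta_phase_rad, spacing_m, carrier_hz):
    """Angle of arrival (rad) for a two-element array under a far-field assumption."""
    wavelength = C / carrier_hz
    s = delta_phase_rad * wavelength / (2.0 * math.pi * spacing_m)
    # Clamp to the valid domain of asin to guard against noisy phase estimates.
    return math.asin(max(-1.0, min(1.0, s)))


if __name__ == "__main__":
    carrier = 5.0e9                      # assumed carrier frequency (Hz)
    spacing = (C / carrier) / 2.0        # assumed half-wavelength antenna spacing
    print(path_length_from_tof(10e-9))
    print(path_rate_from_doppler(20.0, carrier))
    print(math.degrees(aoa_from_phase(math.pi / 2.0, spacing, carrier)))
```

Under these relations, time-of-flight constrains range, angle-of-arrival constrains direction, and Doppler shift captures motion, which is consistent with the abstract's claim that the three image components relate to these primitives.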