🤖 AI Summary
Millimeter-wave (mmWave) radar point clouds are inherently sparse, severely limiting human mesh reconstruction accuracy. To address this, we propose a two-stage deep learning framework: (1) a spatio-temporal point cloud completion network to densify sparse mmWave point clouds; and (2) a 2D-3D motion feature cross-fusion module that injects 2D semantic priors from single-view image masks into the purely mmWave-based reconstruction pipeline, enabling end-to-end regression of SMPL parameters. Our method establishes the first privacy-preserving point cloud enhancement paradigm that requires image mask supervision only during training and performs inference entirely without visual input, thereby overcoming modality isolation. Evaluated on multiple benchmarks, it significantly outperforms state-of-the-art methods, reducing PA-MPJPE by 18.7% on average. Moreover, the enhanced point clouds are plug-and-play compatible with existing reconstruction models, consistently improving their performance.
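For readers unfamiliar with the PA-MPJPE metric cited above, the following is a minimal NumPy sketch of how it is commonly computed: the predicted joints are aligned to the ground truth with a similarity transform (scale, rotation, translation) found by Procrustes analysis, and the mean Euclidean joint error is taken after alignment. The function names `mpjpe` and `pa_mpjpe` and the `(J, 3)` array layout are assumptions made for illustration; this is not the authors' evaluation code.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error between two (J, 3) joint arrays."""
    return np.linalg.norm(pred - gt, axis=1).mean()


def pa_mpjpe(pred, gt):
    """MPJPE after aligning pred to gt with a similarity transform.

    pred, gt: (J, 3) arrays of 3D joint positions (assumed layout).
    """
    # Remove translation by centering both joint sets.
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g

    # Cross-covariance between centered prediction and ground truth.
    H = p.T @ g
    U, S, Vt = np.linalg.svd(H)
    V = Vt.T

    # Force a proper rotation (determinant +1), ruling out reflections.
    d = 1.0 if np.linalg.det(V @ U.T) >= 0 else -1.0
    D = np.diag([1.0, 1.0, d])
    R = V @ D @ U.T

    # Least-squares optimal scale for the chosen rotation.
    scale = (S * np.diag(D)).sum() / (p ** 2).sum()

    aligned = scale * p @ R.T + mu_g
    return mpjpe(aligned, gt)
```

Because the alignment removes global scale, rotation, and translation, PA-MPJPE isolates errors in articulated pose; a relative reduction such as the one reported above is measured on this aligned error.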
📝 Abstract
Millimeter-wave (mmWave) radar offers robust sensing capabilities in diverse environments, making it a highly promising solution for human body reconstruction due to its privacy-friendly and non-intrusive nature. However, the significant sparsity of mmWave point clouds limits estimation accuracy. To overcome this challenge, we propose a two-stage deep learning framework that enhances mmWave point clouds and improves human body reconstruction accuracy. Our method includes a mmWave point cloud enhancement module that densifies the raw data by leveraging temporal features and a multi-stage completion network, followed by a 2D-3D fusion module that extracts both 2D and 3D motion features to refine SMPL parameters. The mmWave point cloud enhancement module learns detailed shape and posture information from 2D human masks in single-view images. Image-based supervision is involved only during the training phase, while inference relies solely on sparse point clouds to maintain privacy. Experiments on multiple datasets demonstrate that our approach outperforms state-of-the-art methods and that the enhanced point clouds further improve performance when integrated into existing models.