🤖 AI Summary
Pure IMU-based methods cannot recover global translation or the relative positions between people. To address this limitation, the paper proposes a multi-person full-body motion tracking framework that fuses sparse wearable IMUs with ultra-wideband (UWB) ranging measurements. The method formulates a structured state-space model in which inter-subject UWB distance measurements serve as geometric constraints, and it introduces a two-stage optimization strategy that, for the first time, explicitly leverages these constraints to improve global trajectory accuracy. The contributions are threefold: (1) the authors release GIP-DB, the first publicly available two-person IMU+UWB motion dataset; (2) the method achieves state-of-the-art performance in both synthetic and real-world scenarios, reducing global translation error by 32% and inter-personal relative pose error by 27% over existing approaches; (3) the work empirically validates the feasibility and robustness of a lightweight, markerless, non-line-of-sight IMU+UWB solution for outdoor multi-person pose capture.
📝 Abstract
Tracking human full-body motion using sparse wearable inertial measurement units (IMUs) overcomes the limitations of occlusion and instrumentation of the environment inherent in vision-based approaches. However, purely IMU-based tracking compromises both translation estimates and the accuracy of relative positioning between individuals, as inertial cues are inherently self-referential and provide no direct spatial reference for others. In this paper, we present a novel approach for robustly estimating body poses and global translation for multiple individuals by leveraging the distances between sparse wearable sensors, both on each individual and across multiple individuals. Our method, Group Inertial Poser, estimates these absolute distances between pairs of sensors from ultra-wideband (UWB) ranging and fuses them with inertial observations as input into structured state-space models to integrate temporal motion patterns for precise 3D pose estimation. Our novel two-step optimization further leverages the estimated distances for accurately tracking people's global trajectories through the world. We also introduce GIP-DB, the first IMU+UWB dataset for two-person tracking, which comprises 200 minutes of motion recordings from 14 participants. In our evaluation, Group Inertial Poser outperforms previous state-of-the-art methods in accuracy and robustness across synthetic and real-world data, showing the promise of IMU+UWB-based multi-human motion capture in the wild. Code, models, dataset: https://github.com/eth-siplab/GroupInertialPoser
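
To make concrete the idea of using inter-person sensor distances to correct global translation, the following is a minimal toy sketch and not the paper's actual optimizer. It assumes each person's sensor positions relative to their own body are already known from pose estimation, that the UWB ranges are noise-free, and that only the translation of person B relative to person A has drifted. The function name `refine_offset`, the sensor layout, and all numeric values are hypothetical.

```python
import math

def refine_offset(sensors_a, sensors_b_local, distances, t_init, lr=0.4, iters=5000):
    """Gradient descent on the mean squared range residual (illustrative only).

    sensors_a:       sensor positions of person A in the world frame
    sensors_b_local: sensor positions of person B relative to B's root
    distances[i][j]: measured range between sensor i on A and sensor j on B
    t_init:          drifted initial guess of B's root position
    """
    t = list(t_init)
    n_pairs = len(sensors_a) * len(sensors_b_local)
    for _ in range(iters):
        grad = [0.0, 0.0, 0.0]
        for i, a in enumerate(sensors_a):
            for j, b in enumerate(sensors_b_local):
                # Vector from sensor i on A to sensor j on B under the current guess.
                v = [b[k] + t[k] - a[k] for k in range(3)]
                norm = max(math.sqrt(sum(c * c for c in v)), 1e-9)
                residual = norm - distances[i][j]
                for k in range(3):
                    grad[k] += 2.0 * residual * v[k] / norm
        t = [t[k] - lr * grad[k] / n_pairs for k in range(3)]
    return t

# Hypothetical sensor layout (metres): head, left wrist, right wrist, foot.
layout = [(0.0, 0.0, 1.7), (-0.4, 0.0, 1.0), (0.4, 0.0, 1.0), (0.0, 0.15, 0.1)]
true_offset = (1.5, 0.8, 0.0)   # assumed ground-truth position of person B
drifted = (1.2, 1.1, 0.1)       # assumed IMU-only estimate after drift

# Simulate noise-free pairwise ranges from the ground truth.
ranges = [
    [math.dist(a, [b[k] + true_offset[k] for k in range(3)]) for b in layout]
    for a in layout
]

refined = refine_offset(layout, layout, ranges, drifted)
print("refined offset:", [round(c, 3) for c in refined])
```

In this toy setting, the pairwise ranges alone are enough to pull the drifted translation back toward the ground truth. Real UWB ranges are noisy and can be biased, so a practical system would need robust weighting and temporal modelling on top of such a residual.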