MuPPet: Multi-person 2D-to-3D Pose Lifting

📅 2026-04-08

🤖 AI Summary

Existing 2D-to-3D pose lifting methods often neglect inter-person relationships or cannot handle varying numbers of people, which limits their effectiveness in multi-person settings. This work proposes MuPPet, a multi-person pose lifting framework that explicitly models inter-person correlations. MuPPet combines Person Encoding, Permutation Augmentation, and Dynamic Multi-Person Attention, and accommodates a variable number of input individuals. On group interaction datasets it significantly outperforms state-of-the-art single- and multi-person 2D-to-3D pose lifting methods and is more robust under occlusion. These results underscore the importance of modeling inter-person correlations for accurate and robust 3D pose estimation.

📝 Abstract

Multi-person social interactions are inherently built on coherence and relationships among all individuals within the group, making multi-person localization and body pose estimation essential to understanding these social dynamics. One promising approach is 2D-to-3D pose lifting, which builds on the significant advances in 2D pose estimation to produce a 3D human pose with rich spatial detail. However, existing 2D-to-3D pose lifting methods often neglect inter-person relationships or cannot handle varying group sizes, limiting their effectiveness in multi-person settings. We propose MuPPet, a novel multi-person 2D-to-3D pose lifting framework that explicitly models inter-person correlations. To leverage these inter-person dependencies, our approach introduces Person Encoding to structure individual representations, Permutation Augmentation to enhance training diversity, and Dynamic Multi-Person Attention to adaptively model correlations between individuals. Extensive experiments on group interaction datasets demonstrate that MuPPet significantly outperforms state-of-the-art single- and multi-person 2D-to-3D pose lifting methods and improves robustness in occlusion scenarios. Our findings highlight the importance of modeling inter-person correlations, paving the way for accurate and socially aware 3D pose estimation. Our code is available at: https://github.com/Thomas-Markhorst/MuPPet
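
The abstract names the components but does not spell out how they work, so the following is only a minimal sketch of two general ideas it points to: shuffling person order during training, and attending across a variable number of people by masking padded slots. It is not the authors' implementation (see the linked repository for that). All function names (`permute_people`, `pad_group`, `masked_person_attention`) and the flat per-person feature vectors are hypothetical, and the attention shown is plain scaled dot-product self-attention rather than the paper's Dynamic Multi-Person Attention.

```python
import math
import random


def permute_people(poses_2d, poses_3d, rng):
    """Shuffle person order, applying the same order to inputs and targets."""
    order = list(range(len(poses_2d)))
    rng.shuffle(order)
    return [poses_2d[i] for i in order], [poses_3d[i] for i in order]


def pad_group(features, max_people):
    """Pad per-person feature vectors to max_people; return (padded, mask).

    mask[i] is True for real people and False for padding slots.
    """
    dim = len(features[0])
    n_pad = max_people - len(features)
    padded = features + [[0.0] * dim for _ in range(n_pad)]
    mask = [True] * len(features) + [False] * n_pad
    return padded, mask


def masked_person_attention(features, mask):
    """Self-attention across people that ignores padded slots (assumed design)."""
    dim = len(features[0])
    scale = math.sqrt(dim)
    output = []
    for query, query_valid in zip(features, mask):
        if not query_valid:
            # Padding slots produce a zero vector and attend to nothing.
            output.append([0.0] * dim)
            continue
        # Padded keys get a score of -inf, so their softmax weight is exactly 0.
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale if valid else float("-inf")
            for key, valid in zip(features, mask)
        ]
        top = max(scores)  # finite, because the query itself is a valid key
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        output.append(
            [sum(w * v[d] for w, v in zip(weights, features)) for d in range(dim)]
        )
    return output
```

In this sketch, a group of two people padded to four slots yields attention weights that sum to one over the two real people only, so the same function can serve groups of different sizes within one fixed-size batch.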

Problem

Research questions and friction points this paper is trying to address.

multi-person

2D-to-3D pose lifting

inter-person relationships

social interactions

3D human pose estimation

Innovation

Methods, ideas, or system contributions that make the work stand out.

inter-person correlation

2D-to-3D pose lifting

dynamic attention