MotionGRPO: Overcoming Low Intra-Group Diversity in GRPO-Based Egocentric Motion Recovery

📅 2026-05-07

🤖 AI Summary
This work addresses diffusion-based full-body 3D human motion recovery from head-mounted device signals, where global distribution matching often compromises local joint accuracy. The authors formulate diffusion sampling as a Markov decision process and post-train the model with Group Relative Policy Optimization (GRPO), a reinforcement learning algorithm. The resulting framework, MotionGRPO, uses a hybrid reward that combines a learned, condition-aware perceptual model for global visual plausibility with explicit constraints for local joint precision. A noise-injection strategy increases intra-group sample diversity, alleviating the vanishing gradients that limited diversity causes during policy optimization. The authors report state-of-the-art performance in both visual fidelity and joint precision compared with existing methods.
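The hybrid reward described above can be illustrated with a minimal sketch. The page gives no formula, so everything here is an assumption: the function name `hybrid_reward`, the linear combination, the `joint_weight` coefficient, and the use of mean per-joint Euclidean error as the "explicit constraint" are all hypothetical stand-ins, not the authors' definition.

```python
import math


def hybrid_reward(perceptual_score, pred_joints, target_joints, joint_weight=1.0):
    """Hypothetical hybrid reward: global plausibility minus a local joint penalty.

    perceptual_score: scalar assumed to come from a learned perceptual model
                      (higher means more plausible); not computed here.
    pred_joints, target_joints: equal-length lists of (x, y, z) positions.
    joint_weight: assumed trade-off coefficient between the two terms.
    """
    if not pred_joints or len(pred_joints) != len(target_joints):
        raise ValueError("joint lists must be non-empty and of equal length")
    # Mean Euclidean distance per joint serves as the local error term.
    mean_joint_error = sum(
        math.dist(p, t) for p, t in zip(pred_joints, target_joints)
    ) / len(pred_joints)
    return perceptual_score - joint_weight * mean_joint_error
```

With a perfect joint match the reward reduces to the perceptual score alone; any joint error lowers it in proportion to `joint_weight`.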
📝 Abstract
This paper studies full-body 3D human motion recovery from head-mounted device signals. Existing diffusion-based methods often rely on global distribution matching, leading to local joint reconstruction errors. We propose MotionGRPO, a novel framework leveraging reinforcement learning post-training to inject fine-grained guidance into the diffusion process. Technically, we model diffusion sampling as a Markov decision process optimized via Group Relative Policy Optimization (GRPO). To this end, we introduce a hybrid reward mechanism that combines a learned conditioned perceptual model for global visual plausibility and explicit constraints for local joint precision. Our key technical insight is that policy optimization in diffusion-based recovery suffers from vanishing gradients due to limited intra-group sample diversity. To address this, we further introduce a noise-injection strategy that explicitly increases sample variance and stabilizes learning. Extensive experiments demonstrate that MotionGRPO achieves state-of-the-art performance with superior visual fidelity
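To see why limited intra-group diversity can stall learning, consider the group-relative advantage commonly used in GRPO: each sample's reward is normalized by the mean and standard deviation of its group. The sketch below is a generic illustration of that normalization, not the paper's implementation; the function name and the `eps` stabilizer are assumptions.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one sampled group (generic GRPO-style sketch).

    rewards: list of scalar rewards, one per sample in the group.
    eps: assumed small constant that avoids division by zero.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population standard deviation of the group
    return [(r - mean) / (std + eps) for r in rewards]


# A group whose samples all score the same yields zero advantages,
# so the policy-gradient signal for that group disappears.
collapsed = group_relative_advantages([0.5, 0.5, 0.5, 0.5])

# A group with spread in its rewards yields informative, signed advantages.
diverse = group_relative_advantages([0.0, 1.0])
```

When every sample in a group receives the same reward, all advantages are zero and the group contributes no gradient. Increasing the spread of samples within a group, which is what the noise-injection strategy targets, keeps the advantages informative.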
Problem

Research questions and friction points this paper is trying to address.

3D human motion recovery
diffusion-based methods
intra-group diversity
local joint reconstruction
egocentric motion
Innovation

Methods, ideas, or system contributions that make the work stand out.

MotionGRPO
Group Relative Policy Optimization
diffusion-based motion recovery
noise injection
hybrid reward mechanism