MotionGRPO: Overcoming Low Intra-Group Diversity in GRPO-Based Egocentric Motion Recovery

📅 2026-05-07

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

This work addresses a weakness of diffusion-based full-body 3D human motion recovery from head-mounted device signals: matching the global motion distribution often comes at the cost of local joint accuracy. To mitigate this, the authors formulate diffusion sampling as a Markov decision process and post-train the model with Group Relative Policy Optimization (GRPO), a reinforcement learning algorithm. The resulting framework, MotionGRPO, uses a hybrid reward that combines a learned conditioned perceptual model, which scores global visual plausibility, with explicit constraints on local joint precision. Because limited diversity among the samples in a group causes the policy gradient to vanish, the method also injects noise to increase intra-group sample variance and stabilize learning. The authors report state-of-the-art performance in both visual fidelity and joint precision.

📝 Abstract

This paper studies full-body 3D human motion recovery from head-mounted device signals. Existing diffusion-based methods often rely on global distribution matching, leading to local joint reconstruction errors. We propose MotionGRPO, a novel framework leveraging reinforcement learning post-training to inject fine-grained guidance into the diffusion process. Technically, we model diffusion sampling as a Markov decision process optimized via Group Relative Policy Optimization (GRPO). To this end, we introduce a hybrid reward mechanism that combines a learned conditioned perceptual model for global visual plausibility and explicit constraints for local joint precision. Our key technical insight is that policy optimization in diffusion-based recovery suffers from vanishing gradients due to limited intra-group sample diversity. To address this, we further introduce a noise-injection strategy that explicitly increases sample variance and stabilizes learning. Extensive experiments demonstrate that MotionGRPO achieves state-of-the-art performance with superior visual fidelity
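The link between intra-group diversity and vanishing gradients can be illustrated with the group-relative advantage commonly used in GRPO, where each sample's reward is standardized against the other samples in its group. The following is a minimal sketch under that assumption, not the authors' implementation: the function name `group_relative_advantages`, the `eps` parameter, and the reward values are hypothetical and chosen only for illustration.

```python
from statistics import fmean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of samples.

    Assumption: the usual GRPO-style advantage (r_i - mean) / (std + eps),
    computed over samples generated for the same conditioning input.
    """
    mean = fmean(rewards)
    std = pstdev(rewards)  # population standard deviation of the group
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical group with no diversity: every sample earns the same reward,
# so every advantage is zero and the policy-gradient term vanishes.
collapsed = group_relative_advantages([0.5, 0.5, 0.5, 0.5])

# Hypothetical group with diverse samples: rewards differ, so the
# advantages carry a usable learning signal.
diverse = group_relative_advantages([0.2, 0.4, 0.6, 0.8])

print(collapsed)
print(diverse)
```

In this sketch, a group whose samples are nearly identical yields advantages that are zero or dominated by tiny reward differences, which is consistent with the motivation the abstract gives for increasing sample variance through noise injection.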

Problem

Research questions and friction points this paper is trying to address.

3D human motion recovery

diffusion-based methods

intra-group diversity

local joint reconstruction

egocentric motion

Innovation

Methods, ideas, or system contributions that make the work stand out.

MotionGRPO

Group Relative Policy Optimization

diffusion-based motion recovery

noise injection

hybrid reward mechanism
