🤖 AI Summary
Existing motion generation methods fall short in naturalness, smoothness, and plausibility, and mainstream evaluation metrics correlate poorly with human visual perception. To address this, we propose MotionPercept, the first large-scale, crowdsourced benchmark explicitly designed for human motion perception assessment. We further introduce MotionCritic, an end-to-end differentiable evaluation model that formalizes subjective perceptual judgments as an optimizable objective, departing from conventional paradigms that rely on ground-truth supervision or statistical distance measures. MotionCritic integrates multi-scale motion feature extraction, adversarial discriminative training, and joint optimization of generation and evaluation. Experiments demonstrate that motion generators guided by MotionCritic achieve a 12.6% improvement in human preference scores. Both the MotionPercept dataset and the MotionCritic code are publicly released, advancing the shift toward perception-aligned motion generation.
📝 Abstract
Human motion generation is a critical task with a wide range of applications. Achieving high realism in generated motions requires naturalness, smoothness, and plausibility. Despite rapid advancements in the field, current generation methods often fall short of these goals. Furthermore, existing evaluation metrics typically rely on ground-truth-based errors, simple heuristics, or distribution distances, which do not align well with human perceptions of motion quality. In this work, we propose a data-driven approach to bridge this gap by introducing a large-scale human perceptual evaluation dataset, MotionPercept, and a human motion critic model, MotionCritic, that capture human perceptual preferences. Our critic model offers a more accurate metric for assessing motion quality and could be readily integrated into the motion generation pipeline to enhance generation quality. Extensive experiments demonstrate the effectiveness of our approach in both evaluating and improving the quality of generated human motions by aligning with human perceptions. Code and data are publicly available at https://motioncritic.github.io/.
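The abstract does not spell out how a critic is fit to human preferences. As a minimal sketch, assuming the preference data takes the form of (preferred, rejected) motion pairs and that the critic outputs one scalar score per motion, a common choice is the Bradley-Terry pairwise loss shown below. The function names `preference_loss` and `pairwise_accuracy` are hypothetical and are not taken from the MotionCritic code; the actual training objective may differ.

```python
import math


def preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Bradley-Terry negative log-likelihood: -log(sigmoid(s_pref - s_rej)).

    Assumption: the critic assigns a higher scalar score to the motion
    that human annotators preferred.
    """
    margin = score_preferred - score_rejected
    # -log(sigmoid(m)) equals softplus(-m); branch to avoid overflow in exp().
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def pairwise_accuracy(pairs) -> float:
    """Fraction of (preferred_score, rejected_score) pairs ranked correctly."""
    pairs = list(pairs)
    correct = sum(1 for preferred, rejected in pairs if preferred > rejected)
    return correct / len(pairs)
```

Under these assumptions, the loss shrinks as the critic scores the preferred motion further above the rejected one, and pairwise accuracy measures how often the critic's ranking agrees with the annotators. Because the loss is a smooth function of the scores, a critic trained this way could in principle also supply a gradient signal for fine-tuning a generator, which is the kind of integration the abstract describes.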