🤖 AI Summary
Motion capture data often exhibits visually jarring artifacts, such as motion pops and frozen frames, caused by sensor noise and post-processing errors. Existing approaches rely on either labor-intensive manual cleaning or paired corrupted-to-clean training samples, limiting generalizability and practical deployment. This paper proposes a quality-aware diffusion-based restoration framework that operates without paired data. It introduces lightweight motion quality indicators, annotated either manually or by heuristic algorithms, to guide a unified generative-discriminative model that both detects and corrects artifacts end to end. The method trains directly on raw, mixed-quality motion data and applies quality-guided sampling during diffusion inference. Evaluated on SoccerMocap, a real-world 245-hour soccer motion capture dataset, the approach reduces motion pops by 68% and frozen frames by 81%, significantly improving animation usability and robustness.
📝 Abstract
Motion capture (mocap) data often exhibits visually jarring artifacts due to inaccurate sensors and post-processing. Cleaning this corrupted data can require substantial manual effort from human experts, which can be a costly and time-consuming process. Previous data-driven motion cleanup methods offer the promise of automating this cleanup process, but often require in-domain paired corrupted-to-clean training data. Constructing such paired datasets requires access to high-quality, relatively artifact-free motion clips, which often necessitates laborious manual cleanup. In this work, we present StableMotion, a simple yet effective method for training motion cleanup models directly from unpaired corrupted datasets that need cleanup. The core component of our method is the introduction of motion quality indicators, which can be easily annotated through manual labeling or heuristic algorithms and enable training of quality-aware motion generation models on raw motion data with mixed quality. At test time, the model can be prompted to generate high-quality motions using the quality indicators. Our method can be implemented through a simple diffusion-based framework, leading to a unified motion generate-discriminate model, which can be used to both identify and fix corrupted frames. We demonstrate that our proposed method is effective for training motion cleanup models on raw mocap data in production scenarios by applying StableMotion to SoccerMocap, a 245-hour soccer mocap dataset containing real-world motion artifacts. The trained model effectively corrects a wide range of motion artifacts, reducing motion pops and frozen frames by 68% and 81%, respectively. See https://youtu.be/3Y7MMAH02B4 for more results.
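The quality-guided sampling idea in the abstract can be illustrated with a toy loop: a denoiser is conditioned on per-frame quality indicators, the sampler is prompted with the "clean" label at test time, and frames already flagged clean are clamped to their original values so only corrupted frames are regenerated. This is a minimal sketch under assumed details; all names (`denoise_step`, `cleanup`, `QUALITY_CLEAN`) and the inpainting-style clamping are illustrative, not the paper's actual implementation.

```python
import numpy as np

QUALITY_CLEAN, QUALITY_CORRUPT = 1.0, 0.0
NUM_STEPS = 10

def denoise_step(x, step, quality):
    """Stand-in for a learned denoiser. In StableMotion this would be a
    diffusion model taking (noisy motion, step, quality indicators); here
    it just shrinks the noise, with quality entering as conditioning."""
    shrink = 1.0 - step / NUM_STEPS
    return x * shrink * quality[:, None]

def cleanup(motion, frame_quality, rng):
    """Quality-guided sampling sketch: regenerate frames flagged corrupt,
    prompt every frame with the 'clean' indicator, and clamp frames that
    were already labeled clean (an assumed inpainting-style choice)."""
    x = rng.standard_normal(motion.shape)          # start from noise
    prompt = np.full(len(motion), QUALITY_CLEAN)   # ask for clean output
    keep = frame_quality == QUALITY_CLEAN
    for step in range(NUM_STEPS, 0, -1):
        x = denoise_step(x, step, prompt)
        x[keep] = motion[keep]                     # preserve clean frames
    return x
```

Clamping the clean frames ensures the model only rewrites the detected artifacts, which mirrors the abstract's unified generate-discriminate usage: the quality indicators both localize corruption and steer generation.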