🤖 AI Summary
This work addresses the degradation of motion in-betweening quality caused by imprecise keyframe timing annotations. Existing approaches assume exact keyframe timestamps, yet practical annotations often contain temporal errors. We propose a robust motion sequence generation framework that jointly models an explicit time-warping function and spatial pose residuals. The architecture incorporates a learnable temporal mapping module that dynamically retimes keyframes, optimizing temporal coherence together with fine-grained pose details. Implemented as an end-to-end trainable generative neural network, the method requires no precise temporal priors. Extensive evaluation across multiple motion datasets demonstrates strong robustness to approximately timed keyframes: generated motions exhibit natural fluidity, plausible rhythm, and rich sub-motions, significantly outperforming fixed-timing baseline methods.
📝 Abstract
Keyframes are a standard representation for kinematic motion specification. Recent learned motion-inbetweening methods use keyframes as a way to control generative motion models, and are trained to generate life-like motion that matches the exact poses and timings of input keyframes. However, the quality of generated motion may degrade if the timing of these constraints is not perfectly consistent with the desired motion. Unfortunately, correctly specifying keyframe timings is a tedious and challenging task in practice. Our goal is to create a system that synthesizes high-quality motion from keyframes, even if keyframes are imprecisely timed. We present a method that allows constraints to be retimed as part of the generation process. Specifically, we introduce a novel model architecture that explicitly outputs a time-warping function to correct mistimed keyframes, and spatial residuals that add pose details. We demonstrate how our method can automatically turn approximately timed keyframe constraints into diverse, realistic motions with plausible timing and detailed submovements.
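The idea of combining a time-warping function with spatial residuals can be illustrated with a minimal NumPy sketch. It is not the paper's architecture: in the method described above, the warp and residuals are outputs of a learned generative model, whereas here they are plain arrays supplied by the caller. The function names (`monotonic_warp`, `retime_and_refine`), the softplus-plus-cumulative-sum parameterization of the warp, and the linear interpolation between keyframes are all assumptions made for illustration.

```python
import numpy as np


def monotonic_warp(raw_increments):
    """Turn unconstrained values into a monotonic warp from 0 to 1.

    Hypothetical parameterization: softplus makes every step strictly
    positive, so the warped timeline never runs backwards.
    """
    steps = np.logaddexp(0.0, raw_increments)  # softplus(x) = log(1 + e^x)
    cumulative = np.concatenate([[0.0], np.cumsum(steps)])
    return cumulative / cumulative[-1]


def retime_and_refine(key_times, key_poses, raw_increments, residuals):
    """Resample a keyframe-interpolated motion at warped times, then add residuals.

    key_times:      (K,) increasing, approximately timed keyframe timestamps
    key_poses:      (K, D) pose vector per keyframe
    raw_increments: (T - 1,) unconstrained warp parameters (assumed given)
    residuals:      (T, D) per-frame pose corrections (assumed given)
    """
    warp = monotonic_warp(raw_increments)  # shape (T,)
    # Map the normalized warp onto the time span covered by the keyframes.
    source_times = key_times[0] + warp * (key_times[-1] - key_times[0])
    # Coarse motion: linear interpolation of the keyframes, one pose dimension at a time.
    coarse = np.stack(
        [np.interp(source_times, key_times, key_poses[:, d])
         for d in range(key_poses.shape[1])],
        axis=1,
    )
    # Spatial residuals add detail on top of the retimed coarse motion.
    return coarse + residuals
```

With all-zero `raw_increments` the warp is the identity and the output is the linearly interpolated keyframe motion plus residuals. Non-zero values stretch or compress parts of the timeline, which shifts when each keyframe pose is reached without changing the order of the keyframes.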