🤖 AI Summary
This work addresses the challenge of 3D understanding of diverse motions (rigid, deformable, and articulated) in monocular videos. We propose a template-free, structure-agnostic method for dynamic 3D sketch reconstruction. Our core contribution is to introduce sparse, smooth, deformable parametric 3D curves as universal motion primitives, giving a unified, semantics-driven abstraction of motion. The method uses frame-level semantic features to generate noisy 3D motion point clouds as guidance; the curves are then fitted to this guidance through deformation optimization with motion-smoothness regularization, all within an end-to-end differentiable framework. Evaluated across multiple motion categories, our approach improves the fidelity of dynamic 3D sketch reconstruction and the robustness of motion-part segmentation. It supports direct 3D motion analysis and visualization, offering a new paradigm for unsupervised 3D motion understanding.
📝 Abstract
Understanding 3D motion from videos is inherently challenging because of the diverse types of movement involved, ranging from rigid and deformable objects to articulated structures. To address this challenge, we propose Liv3Stroke, a novel approach for abstracting objects in motion with deformable 3D strokes. The detailed movements of an object can be represented either by unstructured motion vectors or by a set of motion primitives that rely on a pre-defined articulation from a template model. Just as a free-hand sketch can intuitively visualize scenes or intentions with a sparse set of lines, we use a set of parametric 3D curves to capture spatially smooth motion elements for general objects with unknown structures. We first extract noisy 3D point-cloud motion guidance from video frames using semantic features, and then deform a set of curves to abstract essential motion features as explicit 3D representations. Such abstraction enables an understanding of the prominent components of motion while remaining robust to environmental factors. Our approach allows direct analysis of 3D object movements from video, tackling the uncertainty that typically arises when real-world motion is translated into recorded footage. The project page is accessible via: https://jaeah.me/liv3stroke_web
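To make the idea of fitting deformable curves to noisy point guidance more concrete, the following is a minimal sketch, not the Liv3Stroke implementation. It assumes cubic Bézier curves as the parametric form, a one-sided nearest-sample distance as the guidance term, and a frame-to-frame control-point displacement as the smoothness term; none of these choices are confirmed by the abstract, and all function names (`bezier_point`, `guidance_loss`, `smoothness_loss`) are hypothetical.

```python
# Illustrative sketch only. The curve type (cubic Bezier) and both loss
# terms are assumptions made for this example, not the paper's formulation.

def bezier_point(ctrl, t):
    """Evaluate a cubic Bezier curve (4 control points, each a 3D tuple) at t in [0, 1]."""
    # Bernstein basis weights for a cubic curve.
    b = ((1 - t) ** 3, 3 * (1 - t) ** 2 * t, 3 * (1 - t) * t ** 2, t ** 3)
    return tuple(sum(w * p[k] for w, p in zip(b, ctrl)) for k in range(3))

def sample_curve(ctrl, n=16):
    """Sample n evenly spaced parameter values, including both endpoints."""
    return [bezier_point(ctrl, i / (n - 1)) for i in range(n)]

def sq_dist(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def guidance_loss(ctrl, points, n=16):
    """Mean squared distance from each guidance point to its nearest curve sample."""
    samples = sample_curve(ctrl, n)
    return sum(min(sq_dist(p, s) for s in samples) for p in points) / len(points)

def smoothness_loss(ctrl_per_frame):
    """Mean squared displacement of control points between consecutive frames."""
    total, count = 0.0, 0
    for prev, cur in zip(ctrl_per_frame, ctrl_per_frame[1:]):
        for a, b in zip(prev, cur):
            total += sq_dist(a, b)
            count += 1
    return total / count if count else 0.0

# Hypothetical data: one straight stroke that shifts slightly between two frames.
frame0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
frame1 = [(x, y + 0.1, z) for x, y, z in frame0]
```

In an optimization loop, a weighted sum of `guidance_loss` over all frames and `smoothness_loss` over the sequence would be minimized with respect to the control points; the weighting and the optimizer are left open here because the abstract does not specify them.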