🤖 AI Summary
This work addresses unsupervised disentanglement of shape and deformation factors for groups of deforming 3D objects (e.g., humans, animals, faces). We propose a disentangled latent optimization framework trained in two stages. The first stage jointly optimizes a generator network together with per-object shape and deformation codes under dedicated regularization. The second stage trains two order-invariant PointNet-based encoders for efficient amortized inference of those codes. The method separates shape from deformation without supervision by enforcing the disentanglement directly in the latent space, which enables unsupervised deformation transfer, deformation classification, and explainability analysis. Experiments on 3D human, animal, and facial expression datasets show that this simple approach performs comparably to or better than existing methods of much higher complexity on these downstream tasks.
📝 Abstract
In this work, we propose a disentangled latent optimization-based method for parameterizing grouped deforming 3D objects into shape and deformation factors in an unsupervised manner. Our approach involves the joint optimization of a generator network along with the shape and deformation factors, supported by specific regularization techniques. For efficient amortized inference of the disentangled shape and deformation codes, we train two order-invariant PointNet-based encoder networks in the second stage of our method. We demonstrate several significant downstream applications of our method, including unsupervised deformation transfer, deformation classification, and explainability analysis. Extensive experiments on 3D human, animal, and facial expression datasets demonstrate that our simple approach is highly effective in these downstream tasks, performing comparably to or better than existing methods of much higher complexity.
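To make the "order-invariant" property of the two encoders concrete, the following is a minimal sketch of a PointNet-style encoder: a shared per-point map followed by a max-pool over points. It is an illustration under stated assumptions, not the authors' architecture. The names `make_weights` and `encode`, the single linear layer, the 4-dimensional code size, and the random untrained weights are all hypothetical.

```python
import random


def make_weights(in_dim, out_dim, seed):
    """Random, untrained weights for one shared per-point linear layer (illustrative only)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def encode(points, weights):
    """Apply the same linear map + ReLU to every point, then max-pool over points.

    Because max is symmetric in its arguments, the resulting code does not
    depend on the order in which the points are listed.
    """
    per_point = [
        [max(0.0, sum(w * x for w, x in zip(row, p))) for row in weights]
        for p in points
    ]
    # Channel-wise maximum across all points.
    return [max(channel) for channel in zip(*per_point)]


# Two separate encoders, mirroring the shape/deformation split (assumed sizes).
shape_weights = make_weights(in_dim=3, out_dim=4, seed=0)
deform_weights = make_weights(in_dim=3, out_dim=4, seed=1)

cloud = [(0.0, 0.0, 0.0), (1.0, 0.5, -0.2), (-0.3, 0.8, 0.4), (0.2, -0.7, 0.9)]
shuffled = list(reversed(cloud))  # same points, different order

shape_code = encode(cloud, shape_weights)
deform_code = encode(cloud, deform_weights)

# Reordering the input points leaves both codes unchanged.
assert encode(shuffled, shape_weights) == shape_code
assert encode(shuffled, deform_weights) == deform_code
```

In a trained system of this kind, a deformation transfer would pair the shape code of one object with the deformation code of another and pass both to the generator. The generator and its training objective are not sketched here because the abstract does not specify them.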