Problem
Research questions and friction points this paper aims to address:
Learning latent representations of 3D volumetric shapes
Avoiding trivial solutions in contrastive loss minimization
Integrating multi-modal input for better reconstruction and classification