🤖 AI Summary
This work addresses the difficulty variational autoencoders (VAEs) have in modeling non-commutative structure that is intrinsic to the data, a property that conventional symmetry constraints can suppress by enforcing commutativity. The authors propose a Lie Group VAE framework trained in two phases: first, an unconstrained phase diagnoses algebraic non-commutativity in the latent space and order sensitivity in the decoder; second, a deformation-stability constraint with a data-driven calibration constant aligns the two. This approach is the first to link non-commutativity diagnostics with generative behavior, integrating Lie group geometry, finite Baker–Campbell–Hausdorff deviation metrics, and order-swap reconstruction tests. Evaluated on dSprites, 3DShapes, 3DCars, and CelebA against beta-VAE, CLG-VAE, and CFASL, the method improves reconstruction quality on the synthetic benchmarks, shows clearer order-dependent latent compositions, and produces more stable reconstructions.
📝 Abstract
Variational autoencoders (VAEs) often struggle to represent non-commutative structure in learned latent spaces. Symmetry-aware VAEs commonly address this issue by enforcing commutativity through algebraic regularization, which is appropriate for commutative transformation groups but can suppress meaningful non-commutative structure when it is intrinsic to the data. We argue that non-commutativity should instead be explicitly diagnosed and reflected in reconstruction behavior. We introduce a Lie Group VAE framework that combines geometric and algebraic perspectives on uncertainty while separating discrete generative factors from continuous geometric transformations. In a first phase, the model is trained without structural constraints while algebraic non-commutativity is measured through finite Baker-Campbell-Hausdorff deviations and decoder order sensitivity is measured through reconstruction order-swap tests. These diagnostics reveal a scale mismatch between latent non-commutativity and reconstruction behavior under unconstrained training. In a second phase, we introduce a deformation-stability constraint with a data-driven calibration constant that aligns decoder sensitivity with algebraic non-commutativity. We evaluate the framework on dSprites, 3DShapes, 3DCars, and CelebA against generic and symmetry-aware baselines, including beta-VAE, CLG-VAE, and CFASL. Across synthetic benchmarks, the method improves reconstruction quality and yields decoder-level behavior more consistent with latent non-commutative structure. Qualitative analyses show clearer order-dependent latent compositions and more stable reconstructions. On CelebA, the model yields more faithful reconstructions and factor-specific latent traversals than CFASL, while also exhibiting meaningful order-dependent interactions between learned latent directions.
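The two diagnostics named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions, so everything below is an assumption for illustration: the deviation is taken as the Frobenius norm of exp(A)exp(B) - exp(A+B), the order-swap gap as the distance between decoder outputs for the two composition orders, and `expm_series`, `bch_deviation`, `order_swap_gap`, and `toy_decode` are hypothetical names. The toy decoder is a stand-in, not the model's decoder.

```python
import numpy as np

def expm_series(m, terms=30):
    """Matrix exponential via a truncated power series (adequate for small norms)."""
    result = np.eye(m.shape[0])
    term = np.eye(m.shape[0])
    for k in range(1, terms):
        term = term @ m / k
        result = result + term
    return result

def bch_deviation(a, b):
    """Assumed metric: ||exp(a) exp(b) - exp(a + b)||_F, which vanishes when a and b commute."""
    return np.linalg.norm(expm_series(a) @ expm_series(b) - expm_series(a + b))

def order_swap_gap(decode, a, b):
    """Assumed test: distance between decoder outputs for the two composition orders."""
    g_ab = expm_series(a) @ expm_series(b)
    g_ba = expm_series(b) @ expm_series(a)
    return np.linalg.norm(decode(g_ab) - decode(g_ba))

# Two generators of so(3): infinitesimal rotations about the x and y axes.
Lx = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], dtype=float)
Ly = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], dtype=float)

def toy_decode(g):
    # Hypothetical stand-in for a decoder: apply the group element to a fixed point.
    return g @ np.array([1.0, 0.0, 0.0])

t = 0.5
dev_noncommuting = bch_deviation(t * Lx, t * Ly)        # x and y rotations do not commute
dev_commuting = bch_deviation(t * Lx, 2 * t * Lx)       # same axis, so they commute
gap_noncommuting = order_swap_gap(toy_decode, t * Lx, t * Ly)
gap_commuting = order_swap_gap(toy_decode, t * Lx, 2 * t * Lx)

print(dev_noncommuting, dev_commuting)
print(gap_noncommuting, gap_commuting)
```

Comparing the two quantities on the same pair of latent directions is one way to see the kind of scale mismatch the abstract describes: the algebraic deviation and the decoder-level gap need not have the same magnitude unless something ties them together.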