Multiple Invertible and Partial-Equivariant Function for Latent Vector Transformation to Enhance Disentanglement in VAEs

📅 2025-02-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the lack of an explicit inductive bias for disentangled representation learning in variational autoencoders (VAEs), this paper proposes the MIPE-transformation framework. Methodologically, MIPE 1) introduces IPE-transformations (Invertible and Partial-Equivariant), which guarantee the invertibility of the latent-to-latent map while preserving a portion of the equivariance of the input-to-latent map; 2) relaxes the prior and posterior to an unrestricted form via a learnable conversion to an approximated exponential family (EF-conversion); and 3) integrates and jointly trains multiple IPE-transformation and EF-conversion units end to end. The approach unifies invertible transformations, equivariance constraints, exponential-family distributions, and variational inference in a multi-unit co-learned architecture. Evaluated on 3D Cars, 3D Shapes, and dSprites, MIPE-transformation improves disentanglement metrics (MIG, SAP, and DCI) over state-of-the-art baselines such as β-VAE and FactorVAE.
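
To make the invertibility guarantee concrete, here is a minimal numpy sketch (not the paper's implementation) of one standard way to parameterize an always-invertible latent-to-latent map: the Cayley transform of a skew-symmetric matrix, whose result is orthogonal, so the inverse is simply the transpose. The function name `cayley_map` is illustrative, not from the paper.

```python
import numpy as np

def cayley_map(A):
    """Build an orthogonal (hence invertible) latent transform Q from
    an unconstrained parameter matrix A via the Cayley transform
    Q = (I - S) @ inv(I + S), with S = A - A.T skew-symmetric.
    Since Q is orthogonal, its inverse is exactly Q.T."""
    d = A.shape[0]
    S = A - A.T                      # skew-symmetrize the free parameters
    I = np.eye(d)
    # I + S is always invertible: skew-symmetric S has purely
    # imaginary eigenvalues, so I + S has no zero eigenvalue.
    return (I - S) @ np.linalg.inv(I + S)

rng = np.random.default_rng(0)
d = 8
Q = cayley_map(rng.normal(size=(d, d)))
z = rng.normal(size=d)

z_t = Q @ z          # latent-to-latent transformation
z_back = Q.T @ z_t   # exact, cheap inverse via orthogonality
print(np.allclose(z, z_back))  # True
```

The actual IPE-transformation additionally constrains the map to preserve partial equivariance with the input-to-latent encoder; this sketch only demonstrates the invertibility component.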

📝 Abstract
Disentanglement learning is a core issue for understanding and re-using trained information in Variational AutoEncoder (VAE), and effective inductive bias has been reported as a key factor. However, the actual implementation of such bias is still vague. In this paper, we propose a novel method, called Multiple Invertible and partial-equivariant transformation (MIPE-transformation), to inject inductive bias by 1) guaranteeing the invertibility of latent-to-latent vector transformation while preserving a certain portion of equivariance of input-to-latent vector transformation, called Invertible and partial-equivariant transformation (IPE-transformation), 2) extending the form of prior and posterior in VAE frameworks to an unrestricted form through a learnable conversion to an approximated exponential family, called Exponential Family conversion (EF-conversion), and 3) integrating multiple units of IPE-transformation and EF-conversion, and their training. In experiments on 3D Cars, 3D Shapes, and dSprites datasets, MIPE-transformation improves the disentanglement performance of state-of-the-art VAEs.
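
As a concrete illustration of the exponential-family form that EF-conversion targets (this is the standard Gaussian-to-natural-parameter identity, not the paper's learnable conversion), a diagonal Gaussian with mean μ and variance σ² can be rewritten with natural parameters η = (μ/σ², −1/(2σ²)) and sufficient statistics (z, z²):

```python
import numpy as np

def gaussian_to_natural(mu, var):
    """Map Gaussian (mean, variance) to exponential-family natural
    parameters eta = (mu/var, -1/(2*var)), matching sufficient
    statistics T(z) = (z, z^2)."""
    return mu / var, -1.0 / (2.0 * var)

def natural_to_gaussian(eta1, eta2):
    """Invert the mapping: var = -1/(2*eta2), mu = eta1 * var."""
    var = -1.0 / (2.0 * eta2)
    return eta1 * var, var

mu = np.array([0.5, -1.0])
var = np.array([2.0, 0.25])
eta1, eta2 = gaussian_to_natural(mu, var)
mu_r, var_r = natural_to_gaussian(eta1, eta2)
print(np.allclose(mu, mu_r), np.allclose(var, var_r))  # True True
```

EF-conversion generalizes beyond this fixed Gaussian case by learning the conversion to an approximated exponential family, so the prior and posterior are not restricted to one named distribution.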
Problem

Research questions and friction points this paper is trying to address.

Enhancing disentanglement in VAEs
Implementing inductive bias effectively
Improving latent vector transformation methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Invertible partial-equivariant latent transformation
Learnable exponential family conversion
Integration of multiple transformation units
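
The three contributions above compose: multiple invertible units are stacked and trained together. As a hedged sketch of that composition (hypothetical class name `InvertibleUnit`, using a QR-based orthogonal map as a stand-in for an IPE-transformation), a stack of units can be applied forward and then inverted exactly by traversing it in reverse:

```python
import numpy as np

class InvertibleUnit:
    """One hypothetical unit: an orthogonal latent map obtained by QR
    decomposition, standing in for an IPE-transformation. Orthogonality
    makes the inverse the transpose."""
    def __init__(self, dim, rng):
        q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
        self.Q = q

    def forward(self, z):
        return self.Q @ z

    def inverse(self, z):
        return self.Q.T @ z

rng = np.random.default_rng(1)
units = [InvertibleUnit(6, rng) for _ in range(3)]  # "multiple units"

z = rng.normal(size=6)
out = z
for u in units:                 # forward pass through the stack
    out = u.forward(out)
for u in reversed(units):       # exact inversion in reverse order
    out = u.inverse(out)
print(np.allclose(out, z))  # True
```

In the paper, each unit is paired with an EF-conversion and the whole stack is optimized jointly; this sketch only shows why stacking preserves invertibility.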