🤖 AI Summary
Problem: Unsupervised or semi-supervised disentanglement and interpretation of generative factors in high-dimensional scientific data remain challenging. Method: We propose Aux-VAE, a variational autoencoder architecture that incorporates auxiliary variables, such as known physical parameters, to guide statistical disentanglement in the latent space. Without substantially modifying the standard VAE objective, Aux-VAE aligns latent variables with the auxiliary information through its encoder-decoder structure, effectively injecting domain-specific priors that improve both disentanglement and interpretability. Contribution/Results: Experiments on multiple scientific datasets, including astrophysical simulations, demonstrate that Aux-VAE outperforms baseline models on key metrics: disentanglement quality (e.g., MIG, SAP), generalization to downstream tasks, and interpretability of physical mechanisms. Aux-VAE thus offers a generative modeling approach for scientific discovery that balances statistical rigor with domain adaptability.
📝 Abstract
This study addresses the challenge of statistically extracting generative factors from complex, high-dimensional datasets in unsupervised or semi-supervised settings. We investigate encoder-decoder-based generative models for nonlinear dimensionality reduction, focusing on disentangling low-dimensional latent variables that correspond to independent physical factors. We introduce Aux-VAE, a novel architecture within the classical Variational Autoencoder framework that achieves disentanglement with minimal modifications to the standard VAE loss function by leveraging prior statistical knowledge through auxiliary variables. These variables shape the latent space by aligning latent factors with the corresponding auxiliary information. We validate the efficacy of Aux-VAE through comparative assessments on multiple datasets, including astronomical simulations.
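The abstract describes adding an auxiliary-alignment term to an otherwise standard VAE objective. As a minimal NumPy sketch of one plausible form of such a loss (the alignment term, the choice of tying the leading latent dimensions to the auxiliary variables, and the weight `align_weight` are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

rng = np.random.default_rng(0)

def aux_vae_loss(x, x_recon, mu, logvar, z, aux, align_weight=1.0):
    """Standard VAE loss plus a hypothetical auxiliary-alignment penalty.

    The leading latent dimensions are encouraged to match the auxiliary
    variables (e.g. known physical parameters); the rest stay unconstrained.
    """
    # Reconstruction term: squared error for a Gaussian decoder.
    recon = np.mean(np.sum((x - x_recon) ** 2, axis=1))
    # KL divergence between q(z|x) = N(mu, diag(exp(logvar))) and N(0, I).
    kl = -0.5 * np.mean(np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1))
    # Alignment term: tie the first k latent dims to the k auxiliary variables.
    k = aux.shape[1]
    align = np.mean(np.sum((z[:, :k] - aux) ** 2, axis=1))
    return recon + kl + align_weight * align

# Toy shapes: batch of 8, data dim 16, latent dim 4, 2 auxiliary variables.
x = rng.normal(size=(8, 16))
mu = rng.normal(size=(8, 4))
logvar = rng.normal(scale=0.1, size=(8, 4))
z = mu + np.exp(0.5 * logvar) * rng.normal(size=(8, 4))  # reparameterization
x_recon = rng.normal(size=(8, 16))
aux = rng.normal(size=(8, 2))
loss = aux_vae_loss(x, x_recon, mu, logvar, z, aux)
print(float(loss))
```

Because the alignment term is just an extra additive penalty, the standard VAE training loop is unchanged, which matches the abstract's claim of minimal modification to the loss function.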