🤖 AI Summary
Learning interpretable latent representations for tabular data remains challenging. This paper proposes the Structural Embedding Variational Autoencoder (SE-VAE), the first VAE framework to integrate structural equation modeling (SEM) principles into its architecture. SE-VAE employs a modular latent space design that explicitly aligns with predefined groups of observed variables and introduces global confounder latent variables to isolate confounding effects—enabling *architecture-driven disentangled representation learning*, rather than relying on posterior regularization. The method unifies variational inference, causal modeling, and latent variable decomposition. Extensive synthetic experiments demonstrate that SE-VAE significantly outperforms state-of-the-art baselines in factor recovery accuracy, disentanglement interpretability, and robustness to confounding. These results underscore the critical role of theory-guided architectural design in enhancing the scientific credibility and reliability of generative models for tabular data.
📝 Abstract
Learning interpretable latent representations from tabular data remains a challenge in deep generative modeling. We introduce SE-VAE (Structural Equation-Variational Autoencoder), a novel architecture that embeds measurement structure directly into the design of a variational autoencoder. Inspired by structural equation modeling, SE-VAE aligns latent subspaces with known indicator groupings and introduces a global nuisance latent to isolate construct-specific confounding variation. This modular architecture enables disentanglement through design rather than through statistical regularizers alone. We evaluate SE-VAE on a suite of simulated tabular datasets and benchmark its performance against a series of leading baselines using standard disentanglement metrics. SE-VAE consistently outperforms alternatives in factor recovery, interpretability, and robustness to nuisance variation. Ablation results reveal that architectural structure, rather than regularization strength, is the key driver of performance. SE-VAE offers a principled framework for white-box generative modeling in scientific and social domains where latent constructs are theory-driven and measurement validity is essential.