Structural Equation-VAE: Disentangled Latent Representations for Tabular Data

📅 2025-08-08
🤖 AI Summary
Learning interpretable latent representations for tabular data remains challenging. This paper proposes the Structural Equation Variational Autoencoder (SE-VAE), the first VAE framework to integrate structural equation modeling (SEM) principles into its architecture. SE-VAE employs a modular latent space design that explicitly aligns with predefined groups of observed variables and introduces global confounder latent variables to isolate confounding effects, enabling *architecture-driven disentangled representation learning* rather than relying on posterior regularization. The method unifies variational inference, causal modeling, and latent variable decomposition. Extensive synthetic experiments demonstrate that SE-VAE significantly outperforms state-of-the-art baselines in factor recovery, disentanglement, interpretability, and robustness to confounding. These results underscore the critical role of theory-guided architectural design in enhancing the scientific credibility and reliability of generative models for tabular data.

📝 Abstract
Learning interpretable latent representations from tabular data remains a challenge in deep generative modeling. We introduce SE-VAE (Structural Equation-Variational Autoencoder), a novel architecture that embeds measurement structure directly into the design of a variational autoencoder. Inspired by structural equation modeling, SE-VAE aligns latent subspaces with known indicator groupings and introduces a global nuisance latent to isolate construct-specific confounding variation. This modular architecture enables disentanglement through design rather than through statistical regularizers alone. We evaluate SE-VAE on a suite of simulated tabular datasets and benchmark its performance against a series of leading baselines using standard disentanglement metrics. SE-VAE consistently outperforms alternatives in factor recovery, interpretability, and robustness to nuisance variation. Ablation results reveal that architectural structure, rather than regularization strength, is the key driver of performance. SE-VAE offers a principled framework for white-box generative modeling in scientific and social domains where latent constructs are theory-driven and measurement validity is essential.
Problem

Research questions and friction points this paper is trying to address.

Learning interpretable latent representations from tabular data
Disentangling latent subspaces with known indicator groupings
Isolating construct-specific confounding variation through modular architecture
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structural Equation-VAE embeds measurement structure into VAE design
Aligns latent subspaces with known indicator groupings
Uses global nuisance latent to isolate confounding variation
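The core architectural idea above can be illustrated with a minimal sketch: each latent subspace encodes only its own group of indicators (mirroring an SEM measurement model), while one global nuisance latent sees all indicators. This is a hypothetical toy rendering with random linear maps standing in for learned networks, not the authors' implementation; the function names, dimensions, and grouping are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_group(x_group, d_latent):
    # Stand-in for a learned encoder: random linear maps producing
    # the mean and log-variance of a Gaussian posterior.
    d_in = x_group.shape[-1]
    W_mu = rng.normal(size=(d_in, d_latent))
    W_lv = rng.normal(size=(d_in, d_latent))
    return x_group @ W_mu, x_group @ W_lv

def reparameterize(mu, logvar):
    # Standard VAE reparameterization trick: z = mu + sigma * eps.
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def se_vae_encode(x, groups, d_group=2, d_nuisance=2):
    # Modular latent space: each construct latent is computed only from
    # its own indicator group; a global nuisance latent is computed from
    # all indicators to absorb shared confounding variation.
    zs = []
    for cols in groups:
        mu, lv = encode_group(x[:, cols], d_group)
        zs.append(reparameterize(mu, lv))
    mu_g, lv_g = encode_group(x, d_nuisance)
    zs.append(reparameterize(mu_g, lv_g))
    return np.concatenate(zs, axis=1)

# Toy data: 8 indicators measuring two latent constructs (4 each).
x = rng.normal(size=(5, 8))
groups = [list(range(0, 4)), list(range(4, 8))]
z = se_vae_encode(x, groups)
print(z.shape)  # (5, 6): two 2-d construct latents + one 2-d nuisance latent
```

The disentanglement here comes from wiring, not from a regularizer: a construct latent simply has no computational path to indicators outside its group, which is the "disentanglement through design" the paper emphasizes.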
Ruiyu Zhang
The University of Hong Kong
Public Management · Organizations · Bureaucracy · Computational Social Science
Ce Zhao
School of Computer Science, Carnegie Mellon University
Xin Zhao
Department of Applied Social Sciences, The Hong Kong Polytechnic University
Lin Nie
Department of Applied Social Sciences, The Hong Kong Polytechnic University
Wai-Fung Lam
Department of Politics and Public Administration, The University of Hong Kong