Integrating Random Effects in Variational Autoencoders for Dimensionality Reduction of Correlated Data

πŸ“… 2024-12-22
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF

career value

214K/year
πŸ€– AI Summary
Traditional VAEs assume independent observations, limiting their ability to model intrinsically correlated dataβ€”such as spatiotemporal or clustered structures. To address this, we propose REM-VAE, a VAE framework integrating random effects. Its core innovation is the first incorporation of linear mixed-model principles into VAEs: it explicitly disentangles fixed effects (individual-specific latent variables) from random effects (cluster-level correlations) and employs a learnable covariance structure to capture inter-sample dependencies. Accordingly, we reformulate the ELBO objective to incorporate a structured prior over random effects. Evaluated on synthetic benchmarks and multiple real-world spatiotemporal and population datasets, REM-VAE achieves significant improvements: βˆ’12.3% reduction in reconstruction error, βˆ’9.7% decrease in negative log-likelihood, and an average +4.8% gain in downstream classification accuracy. These results demonstrate REM-VAE’s effectiveness and generalization advantage in modeling correlated data.

Technology Category

Application Category

πŸ“ Abstract
Variational Autoencoders (VAE) are widely used for dimensionality reduction of large-scale tabular and image datasets, under the assumption of independence between data observations. In practice, however, datasets are often correlated, with typical sources of correlation including spatial, temporal and clustering structures. Inspired by the literature on linear mixed models (LMM), we propose LMMVAE -- a novel model which separates the classic VAE latent model into fixed and random parts. While the fixed part assumes the latent variables are independent as usual, the random part consists of latent variables which are correlated between similar clusters in the data such as nearby locations or successive measurements. The classic VAE architecture and loss are modified accordingly. LMMVAE is shown to improve squared reconstruction error and negative likelihood loss significantly on unseen data, with simulated as well as real datasets from various applications and correlation scenarios. It also shows improvement in the performance of downstream tasks such as supervised classification on the learned representations.
Problem

Research questions and friction points this paper is trying to address.

Variational Autoencoder (VAE)
Intrinsic Correlation
Performance Enhancement
Innovation

Methods, ideas, or system contributions that make the work stand out.

LMMVAE
Linear Mixed Models
Variational Autoencoders