🤖 AI Summary
Hierarchical Variational Autoencoders (HVAEs) suffer from representation decay and posterior collapse due to suboptimal allocation of latent dimensions across layers.
Method: We propose the first information-theoretic framework for optimizing latent dimension allocation in HVAEs. Under a fixed total latent dimension budget, we prove the existence of an optimal inter-layer dimension ratio $r^{\ast}$ and establish a quantitative trade-off model between information preservation and representation decay, yielding interpretable, reusable design principles for hierarchical architectures. Our method integrates hierarchical variational inference with latent capacity control and introduces an out-of-distribution (OOD) confidence score driven by reconstruction–prior consistency.
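To make the budgeted-allocation idea concrete, here is a minimal sketch of splitting a fixed latent budget across layers with a geometric inter-layer ratio $r$. The function name, the geometric weighting, and the rounding scheme are illustrative assumptions, not the paper's exact procedure.

```python
def allocate_latent_dims(total_budget, num_layers, r):
    """Split `total_budget` latent dimensions over `num_layers` layers so that
    consecutive layers scale by factor r (dim_{l+1} ≈ r * dim_l).
    Assumption: a simple geometric schedule, not the paper's optimizer."""
    # Unnormalized geometric weights: 1, r, r^2, ...
    weights = [r ** l for l in range(num_layers)]
    scale = total_budget / sum(weights)
    dims = [max(1, round(w * scale)) for w in weights]
    # Absorb rounding drift into the first layer so dims sum to the budget.
    dims[0] += total_budget - sum(dims)
    return dims

print(allocate_latent_dims(64, 3, 0.5))  # → [37, 18, 9]
```

Sweeping `r` under a fixed budget and scoring the resulting model is one straightforward way to search for the optimal ratio $r^{\ast}$ empirically.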
Results: On multiple benchmarks, our approach achieves an average 3.2% improvement in AUROC, significantly mitigates posterior collapse, outperforms standard HVAEs and state-of-the-art baselines, and demonstrates cross-architecture generalizability.
📝 Abstract
Out-of-distribution (OOD) detection is a critical task in machine learning, particularly for safety-critical applications where unexpected inputs must be reliably flagged. While hierarchical variational autoencoders (HVAEs) offer improved representational capacity over traditional VAEs, their performance is highly sensitive to how latent dimensions are distributed across layers. Existing approaches often allocate latent capacity arbitrarily, leading to ineffective representations or posterior collapse. In this work, we introduce a theoretically grounded framework for optimizing latent dimension allocation in HVAEs, drawing on principles from information theory to formalize the trade-off between information loss and representational attenuation. We prove the existence of an optimal allocation ratio $r^{\ast}$ under a fixed latent budget, and empirically show that tuning this ratio consistently improves OOD detection performance across datasets and architectures. Our approach outperforms baseline HVAE configurations and provides practical guidance for principled latent structure design, leading to more robust OOD detection with deep generative models.
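The reconstruction–prior consistency score mentioned above can be sketched as follows. This is a hedged illustration only: the combination rule, the `beta` weight, and the diagonal-Gaussian posterior with a standard normal prior are assumptions, not the paper's exact scoring function.

```python
import numpy as np

def ood_score(x, x_recon, mu, logvar, beta=1.0):
    """Illustrative OOD score: reconstruction error plus a prior-consistency
    term. Higher score = more likely out-of-distribution (assumption)."""
    recon_err = np.mean((x - x_recon) ** 2)
    # KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior N(mu, exp(logvar)).
    kl = 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)
    return recon_err + beta * kl
```

An in-distribution input with a faithful reconstruction and a posterior close to the prior yields a score near zero; either a poor reconstruction or a posterior far from the prior raises it.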