🤖 AI Summary
To address the difficulty of representation learning and the high computational cost of uncertainty quantification (UQ) for high-dimensional data, this paper proposes a hybrid VAE-PCE surrogate model. A variational autoencoder (VAE) performs dimensionality reduction and latent-space modeling of high-dimensional features without prior statistical assumptions, and a polynomial chaos expansion (PCE) handles high-order uncertainty propagation. The paper introduces, for the first time, a learning framework for the PCE coefficients based on maximum mean discrepancy (MMD), which enables high-order moment matching without assumptions on the underlying data distribution. Experiments on multiple high-dimensional benchmark problems show that the proposed method significantly improves UQ accuracy while accelerating computation by more than an order of magnitude compared with conventional UQ approaches. The framework combines theoretical rigor, rooted in variational inference and stochastic spectral methods, with practical deployability in engineering applications.
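As a rough illustration of the PCE component mentioned in the summary, the sketch below evaluates a one-dimensional expansion and reads the output mean and variance directly off its coefficients. It is not the paper's implementation: the scalar standard-normal latent variable, the probabilists' Hermite basis, the function names `pce_eval` and `pce_mean_var`, and the coefficient values are all assumptions chosen for illustration.

```python
from math import factorial

import numpy as np
from numpy.polynomial import hermite_e as He


def pce_eval(coeffs, z):
    """Evaluate sum_k c_k * He_k(z) for a scalar latent variable z.

    Assumption: z ~ N(0, 1) and He_k are probabilists' Hermite polynomials.
    """
    return He.hermeval(z, coeffs)


def pce_mean_var(coeffs):
    """Mean and variance implied by the coefficients under z ~ N(0, 1).

    The basis is orthogonal with E[He_j(z) He_k(z)] = k! when j == k and 0
    otherwise, so the mean is c_0 and the variance is sum_{k>=1} c_k^2 * k!.
    """
    c = np.asarray(coeffs, dtype=float)
    norms = np.array([factorial(k) for k in range(len(c))], dtype=float)
    return float(c[0]), float(np.sum(c[1:] ** 2 * norms[1:]))


# Hypothetical coefficients, used only to exercise the functions.
coeffs = [1.0, 0.5, 0.25]
mean, var = pce_mean_var(coeffs)

# Monte Carlo cross-check of the closed-form moments.
rng = np.random.default_rng(0)
samples = pce_eval(coeffs, rng.standard_normal(200_000))
print(mean, var, samples.mean(), samples.var())
```

Because the moments follow from the coefficients in closed form, a fitted expansion of this kind can be queried for output statistics without further sampling, which is one reason spectral surrogates are attractive for UQ.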
📝 Abstract
Learning data representations under uncertainty is an important task that arises in numerous machine learning applications. However, uncertainty quantification (UQ) techniques are computationally intensive and become prohibitively expensive for high-dimensional data. In this paper, we present a novel surrogate model for representation learning and uncertainty quantification, designed for data of moderate to high dimension. The proposed model combines a neural network approach for dimensionality reduction of the (potentially high-dimensional) data with a surrogate model method for learning the data distribution. We first employ a variational autoencoder (VAE) to learn a low-dimensional representation of the data distribution. We then use a polynomial chaos expansion (PCE) formulation to map this distribution to the output target. The coefficients of the PCE are learned from the distribution representation of the training data using a maximum mean discrepancy (MMD) approach. Our model enables us to (a) learn a representation of the data, (b) estimate uncertainty in the high-dimensional data system, and (c) match high-order moments of the output distribution, all without any prior statistical assumptions on the data. Numerical experiments are presented to illustrate the performance of the proposed method.
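The MMD criterion mentioned in the abstract compares two sets of samples through a kernel rather than through explicit density models. The following is a minimal sketch of the standard biased sample estimate of squared MMD, not the authors' training code: the Gaussian (RBF) kernel, the fixed `bandwidth`, and the helper names are assumptions made for illustration.

```python
import numpy as np


def _as_2d(x):
    """Treat a 1-D array as n samples of dimension 1."""
    x = np.asarray(x, dtype=float)
    return x[:, None] if x.ndim == 1 else x


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix k(a_i, b_j) = exp(-||a_i - b_j||^2 / (2 h^2))."""
    a, b = _as_2d(a), _as_2d(b)
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))


def mmd2_biased(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples x and y."""
    return (
        rbf_kernel(x, x, bandwidth).mean()
        + rbf_kernel(y, y, bandwidth).mean()
        - 2.0 * rbf_kernel(x, y, bandwidth).mean()
    )


# Synthetic samples, used only to show the estimator's behaviour.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=500)
y_same = rng.normal(0.0, 1.0, size=500)   # same distribution as x
y_shift = rng.normal(2.0, 1.0, size=500)  # mean shifted by 2

mmd_same = mmd2_biased(x, y_same)
mmd_shift = mmd2_biased(x, y_shift)
print(mmd_same, mmd_shift)
```

In a training loop of the kind the abstract describes, a quantity like `mmd2_biased(model_outputs, observed_outputs)` would presumably serve as the loss minimized over the PCE coefficients; how the paper chooses the kernel and bandwidth is not stated in the abstract.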