🤖 AI Summary
Extracting physically interpretable representations from high-dimensional scientific data remains challenging due to difficulties in capturing intrinsic nonlinear structure and ensuring modeling stability. Method: This paper proposes a Gaussian Mixture Variational Autoencoder (GM-VAE) framework. It introduces an EM-inspired stepwise training strategy that uses block coordinate descent to decouple the E- and M-steps, improving convergence stability; it also designs a spectral interpretability metric grounded in graph-Laplacian smoothness to align latent-space clusters with physical states. Contribution/Results: The end-to-end model integrates variational inference, mixture modeling, and geometric regularization. Evaluated on surface-reaction ODEs, Navier–Stokes wake flows, and combustion Schlieren images, it yields smooth, physically consistent latent manifolds and significantly improves clustering accuracy across operating conditions. The approach establishes a new representation-learning paradigm for complex dynamical systems, achieving compactness, physical interpretability, and generalizability simultaneously.
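The summary does not give the paper's exact E/M updates, so the sketch below illustrates only the generic alternation it describes: an E-step that computes soft cluster responsibilities of latent codes under a Gaussian mixture, and an M-step that updates the mixture parameters in closed form. In the full GM-VAE, the M-step would additionally take gradient steps on the encoder/decoder weights with the responsibilities held fixed; that part is hedged as a comment. Diagonal (isotropic) covariances are an assumption for brevity.

```python
import numpy as np

def e_step(z, pi, mu, var):
    """E-step: responsibilities gamma[n, k] of latent codes z under an
    isotropic Gaussian mixture (pi_k, mu_k, var_k). Assumed form, not
    the paper's exact update."""
    d = z.shape[1]
    log_p = (np.log(pi)[None, :]
             - 0.5 * d * np.log(2 * np.pi * var)[None, :]
             - 0.5 * ((z[:, None, :] - mu[None, :, :]) ** 2).sum(-1) / var[None, :])
    log_p -= log_p.max(axis=1, keepdims=True)   # numerical stability before exp
    gamma = np.exp(log_p)
    return gamma / gamma.sum(axis=1, keepdims=True)

def m_step(z, gamma):
    """M-step: closed-form updates of mixture weights, means, and
    isotropic variances given fixed responsibilities. In the full GM-VAE
    this step would also update network weights by gradient descent on
    the ELBO with gamma frozen (block coordinate descent)."""
    nk = gamma.sum(axis=0)                       # effective counts per component
    pi = nk / len(z)
    mu = (gamma.T @ z) / nk[:, None]
    var = np.array([(gamma[:, k] * ((z - mu[k]) ** 2).sum(-1)).sum()
                    / (nk[k] * z.shape[1]) for k in range(len(nk))])
    return pi, mu, var
```

Alternating `e_step` and `m_step` on encoder outputs (rather than optimizing reconstruction and clustering jointly) is what decouples the two objectives and stabilizes training.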
📝 Abstract
Extracting compact, physically interpretable representations from high-dimensional scientific data is a persistent challenge due to the complex, nonlinear structures inherent in physical systems. We propose a Gaussian Mixture Variational Autoencoder (GM-VAE) framework designed to address this by integrating an Expectation-Maximization (EM)-inspired training scheme with a novel spectral interpretability metric. Unlike conventional VAEs that jointly optimize reconstruction and clustering (often leading to training instability), our method utilizes a block-coordinate descent strategy, alternating between expectation and maximization steps. This approach stabilizes training and naturally aligns latent clusters with distinct physical regimes. To objectively evaluate the learned representations, we introduce a quantitative metric based on graph-Laplacian smoothness, which measures the coherence of physical quantities across the latent manifold. We demonstrate the efficacy of this framework on datasets of increasing complexity: surface reaction ODEs, Navier–Stokes wake flows, and experimental laser-induced combustion Schlieren images. The results show that our GM-VAE yields smooth, physically consistent manifolds and accurate regime clustering, offering a robust data-driven tool for interpreting turbulent and reactive flow systems.
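The abstract does not specify how the graph-Laplacian smoothness metric is constructed, so the following is a minimal sketch of the standard formulation such a metric typically takes: build a k-nearest-neighbor graph over latent codes, form the combinatorial Laplacian L = D − W, and score a physical quantity f by its Rayleigh quotient fᵀLf / fᵀf. A small value means f varies coherently across the latent manifold. The function name and the k-NN/binary-weight choices are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def laplacian_smoothness(z, f, k=10):
    """Rayleigh-quotient smoothness of a physical quantity f over a k-NN
    graph on latent codes z (assumed construction: binary symmetric
    adjacency, combinatorial Laplacian). Smaller = smoother."""
    n = len(z)
    d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                          # no self-edges
    W = np.zeros((n, n))
    idx = np.argsort(d2, axis=1)[:, :k]                   # k nearest neighbors per node
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)                                # symmetrize adjacency
    L = np.diag(W.sum(1)) - W                             # combinatorial Laplacian D - W
    # f^T L f = sum over edges (f_i - f_j)^2; normalize by f^T f
    return float(f @ L @ f) / float(f @ f)
```

A quantity that tracks the manifold's intrinsic coordinate (e.g., a reaction-progress variable) should score much lower than the same values randomly shuffled over the latent points, which is what makes the metric usable as an objective interpretability check.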