AI Summary
Conventional covariance and precision matrices fail to capture conditional independence structures for generalized nonparanormal distributions, i.e., non-Gaussian distributions obtained by unknown diagonal transformations of Gaussian variables.
Method: This paper introduces a generalized nonparanormal model framework and proves that, under mild identifiability conditions, the precision matrix of its latent Gaussian embedding still exactly encodes conditional independence among observed variables. Building on this, we propose a distribution-free two-stage estimation procedure: first estimating the inverse transformation via nonparametric function estimation, then jointly learning a sparse precision matrix to recover the underlying dependency structure.
Contribution/Results: The method is both interpretable and robust. On synthetic data, it accurately recovers the true graphical structure; on multiple real-world datasets, it successfully identifies key variable relationships. By relaxing the Gaussianity assumption, our approach substantially extends the applicability of graphical models to non-Gaussian settings.
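In symbols, the setting summarized above might be written as follows. The notation ($Z$, $f_j$, $\Omega$) is assumed here for illustration and is not taken from the paper. The second line is the standard Gaussian fact; the third line holds directly only when every $f_j$ is invertible, and the paper's conditions for more general $f_j$ are not reproduced here.

```latex
\begin{align*}
Z &\sim \mathcal{N}(\mu, \Sigma), \qquad \Omega = \Sigma^{-1}, \qquad X_j = f_j(Z_j), \quad j = 1, \dots, p, \\
\Omega_{jk} = 0 \;&\Longleftrightarrow\; Z_j \perp\!\!\!\perp Z_k \mid Z_{\setminus\{j,k\}}, \\
f_1, \dots, f_p \text{ invertible} \;&\Longrightarrow\;
  \bigl( X_j \perp\!\!\!\perp X_k \mid X_{\setminus\{j,k\}} \iff \Omega_{jk} = 0 \bigr).
\end{align*}
```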
Abstract
For general non-Gaussian distributions, the covariance and precision matrices do not encode the independence structure of the variables, as they do for the multivariate Gaussian. This paper builds on previous work to show that for a class of non-Gaussian distributions -- those derived from diagonal transformations of a Gaussian -- information about the conditional independence structure can still be inferred from the precision matrix, provided the data meet certain criteria, analogous to the Gaussian case. We call such transformations of the Gaussian the generalized nonparanormal. The functions that define these transformations are, in a broad sense, arbitrary. We also provide a simple and computationally efficient algorithm that leverages this theory to recover the conditional independence structure from generalized nonparanormal data. The effectiveness of the proposed algorithm is demonstrated via synthetic experiments and applications to real-world data.
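The idea of recovering structure from transformed Gaussian data can be illustrated with a minimal sketch. This is not the paper's algorithm: it uses the classical rank-based normal-score approach, which assumes each transformation is strictly monotone, and it replaces sparse precision estimation with a simple threshold on partial correlations. All function names, the threshold of 0.1, and the synthetic three-variable chain are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def normal_scores(column):
    """Map a sample to Gaussian scores through its ranks.

    For a strictly monotone transformation this undoes the transformation
    up to location and scale (assumption: no ties in the data).
    """
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = std_normal.inv_cdf(rank / (n + 1))
    return scores


def correlation_matrix(columns):
    """Sample correlation matrix of a list of equally long columns."""
    p, n = len(columns), len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    cov = [[sum(a * b for a, b in zip(centered[j], centered[k])) / n
            for k in range(p)] for j in range(p)]
    return [[cov[j][k] / math.sqrt(cov[j][j] * cov[k][k])
             for k in range(p)] for j in range(p)]


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices)."""
    p = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(p)]
           for i, row in enumerate(matrix)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[p:] for row in aug]


def estimate_edges(columns, threshold=0.1):
    """Return (edges, partial correlations) from the latent precision matrix."""
    scores = [normal_scores(c) for c in columns]
    prec = invert(correlation_matrix(scores))
    p = len(columns)
    pcor = [[1.0 if j == k
             else -prec[j][k] / math.sqrt(prec[j][j] * prec[k][k])
             for k in range(p)] for j in range(p)]
    edges = {(j, k) for j in range(p) for k in range(j + 1, p)
             if abs(pcor[j][k]) > threshold}
    return edges, pcor


# Synthetic latent Gaussian chain z1 -> z2 -> z3, so z1 and z3 are
# conditionally independent given z2 (zero entry in the latent precision).
rng = random.Random(0)
n = 4000
z1 = [rng.gauss(0, 1) for _ in range(n)]
z2 = [0.7 * a + rng.gauss(0, 1) for a in z1]
z3 = [0.7 * b + rng.gauss(0, 1) for b in z2]

# Observed data: a different monotone transformation per coordinate.
x = [[math.exp(v) for v in z1], [v ** 3 for v in z2], [math.atan(v) for v in z3]]

edges, pcor = estimate_edges(x)
print(sorted(edges))
```

In this sketch the observed variables are far from Gaussian, yet the estimated edge set is computed from the precision matrix of the rank-based Gaussian scores rather than from the raw covariance. Handling non-monotone transformations, which the abstract's "arbitrary" functions would allow, requires the paper's own theory and is outside what this example shows.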