🤖 AI Summary
This study addresses the problem of simultaneously inferring the number of clusters and the cluster assignments for covariance matrix data, such as functional brain connectivity. The authors propose the MFM-Wishart model, which combines a mixture-of-finite-mixtures (MFM) prior with Wishart mixture components to enable Bayesian model-based clustering of covariance matrices. The approach supports joint posterior inference on the number of clusters and their assignments and, under standard regularity conditions, comes with guarantees of posterior consistency for the number of clusters and posterior contraction of the mixing measure. Simulation studies, run with an efficient MCMC algorithm, show competitive clustering performance and accurate recovery of the number of clusters, even under model misspecification. An application to infant functional connectivity during sleep, estimated from fNIRS data, reveals interpretable heterogeneity.
📝 Abstract
Data represented as covariance-type matrices arise in many fields, including brain functional connectivity and diffusion tensor imaging. We develop the MFM-Wishart, a Bayesian model-based clustering approach for such data that combines Wishart mixture components with a mixture-of-finite-mixtures (MFM) prior, allowing joint posterior inference on both the number of clusters and clustering assignments. Theoretically, we study the properties of Wishart kernels in the context of mixture models and then establish results for posterior consistency for the number of clusters and posterior contraction of the mixing measure under standard regularity conditions. Computationally, we develop an efficient Markov chain Monte Carlo (MCMC) algorithm for posterior inference. Simulation studies show competitive clustering performance and accurate recovery of the number of clusters, even under model misspecification. We apply MFM-Wishart to cluster infants based on functional connectivity during sleep, estimated from functional near-infrared spectroscopy (fNIRS) data, illustrating the practical utility of the model and revealing interpretable heterogeneity.
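To make the role of the Wishart kernel in clustering concrete, the following is a minimal sketch rather than the authors' algorithm. It assumes a fixed number of clusters (two), known scale matrices, equal weights, and 2x2 matrices only, whereas MFM-Wishart places a prior on the number of clusters and infers all of these by MCMC. The function names, the degrees of freedom, and the scale matrices are hypothetical choices for illustration, and the Wishart is parameterized by degrees of freedom and a scale matrix, which may differ from the paper's convention.

```python
import math
import random


def wishart_logpdf_2x2(x, df, scale):
    """Log density of a 2x2 Wishart(df, scale) at a symmetric positive definite x."""
    p = 2
    a, b, d = x[0][0], x[0][1], x[1][1]
    s11, s12, s22 = scale[0][0], scale[0][1], scale[1][1]
    det_x = a * d - b * b
    det_s = s11 * s22 - s12 * s12
    # trace(scale^{-1} x), using the closed-form inverse of a 2x2 matrix
    trace_term = (s22 * a - 2.0 * s12 * b + s11 * d) / det_s
    # log of the bivariate gamma function Gamma_2(df / 2)
    log_mvgamma = (0.5 * math.log(math.pi)
                   + math.lgamma(df / 2) + math.lgamma((df - 1) / 2))
    return (0.5 * (df - p - 1) * math.log(det_x)
            - 0.5 * trace_term
            - 0.5 * df * p * math.log(2.0)
            - 0.5 * df * math.log(det_s)
            - log_mvgamma)


def sample_wishart_2x2(df, scale, rng):
    """Draw from Wishart(df, scale) for integer df as a sum of Gaussian outer products."""
    # Cholesky factor of the 2x2 scale matrix
    l11 = math.sqrt(scale[0][0])
    l21 = scale[0][1] / l11
    l22 = math.sqrt(scale[1][1] - l21 * l21)
    s = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(df):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u, v = l11 * z1, l21 * z1 + l22 * z2  # (u, v) ~ N(0, scale)
        s[0][0] += u * u
        s[0][1] += u * v
        s[1][1] += v * v
    s[1][0] = s[0][1]
    return s


def responsibilities(x, df, scales, weights):
    """Posterior cluster probabilities for x under a Wishart mixture with known parameters."""
    logs = [math.log(w) + wishart_logpdf_2x2(x, df, s)
            for w, s in zip(weights, scales)]
    m = max(logs)  # subtract the max before exponentiating for numerical stability
    unnorm = [math.exp(v - m) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical two-cluster setup: positive vs. negative off-diagonal association.
DF = 10
SCALES = [[[1.0, 0.8], [0.8, 1.0]], [[1.0, -0.8], [-0.8, 1.0]]]
WEIGHTS = [0.5, 0.5]

rng = random.Random(0)
draw = sample_wishart_2x2(DF, SCALES[0], rng)
print(responsibilities(draw, DF, SCALES, WEIGHTS))
```

In a full sampler, an allocation step of this kind would alternate with updates of the cluster parameters and, under an MFM prior, with moves that let the number of occupied clusters change; none of that is shown here.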