🤖 AI Summary
This study addresses the challenge in stochastic block models of simultaneously achieving accurate inference and model selection, in particular the inability of maximum likelihood approaches to automatically determine the number of clusters due to their lack of sparsity. The work establishes, for the first time, a theoretical connection between maximum likelihood variational inference and the semi-relaxed Gromov–Wasserstein optimal transport framework, and proposes a unified optimization scheme that integrates entropy regularization with a sparsity-inducing mechanism. This approach jointly performs parameter estimation and cluster number selection within a single optimization procedure, eliminating the need for grid search or heuristic criteria. Empirical results demonstrate that the method accurately recovers both the connectivity matrix and the underlying cluster structure from finite samples, while automatically identifying the true number of clusters.
📝 Abstract
We study inference in stochastic block models (SBMs) through the lens of optimal transport (OT). We first establish that maximum likelihood variational inference (MLVI) can be interpreted as a semi-relaxed Gromov-Wasserstein (srGW) projection with entropic regularization. While this formulation yields accurate clustering, the entropic regularization prevents transport plans from being sparse, hindering intrinsic model selection. Consequently, we investigate unregularized srGW estimators and prove that they consistently recover both the SBM connectivity matrix and the latent cluster assignments in the asymptotic regime. However, this asymptotic property does not translate into reliable model selection in finite samples, which calls for additional mechanisms to promote sparsity in the inferred cluster proportions. We empirically show that the resulting sparsity-regularized formulation yields estimators that simultaneously recover model parameters and select the number of clusters in a single optimization problem, thereby avoiding costly grid search or heuristic model selection procedures.
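To make the link between sparse transport plans and model selection concrete, the following is a minimal sketch, not the paper's implementation. It assumes the standard squared-loss srGW objective, sum over i, j, k, l of (C[i, j] - Cbar[k, l])^2 * T[i, k] * T[j, l], where the rows of the plan T are constrained to the node weights h and the column sums are left free; the loss that corresponds to MLVI in the paper may differ. The names `srgw_objective`, `C`, `Cbar`, `T`, and `h` are illustrative, and the toy graph is noiseless.

```python
import numpy as np

def srgw_objective(C, Cbar, T):
    """Squared-loss srGW cost: sum_{ijkl} (C_ij - Cbar_kl)^2 T_ik T_jl."""
    p = T.sum(axis=1)  # row marginal (node weights, fixed in srGW)
    q = T.sum(axis=0)  # column marginal (inferred cluster proportions, free)
    term_graph = p @ (C ** 2) @ p
    term_model = q @ (Cbar ** 2) @ q
    cross = np.sum((T.T @ C @ T) * Cbar)
    return term_graph + term_model - 2.0 * cross

# Toy setting: 6 nodes in 2 true blocks, fitted with K = 3 candidate clusters.
z = np.array([0, 0, 0, 1, 1, 1])       # planted block labels
n, K = z.size, 3
Cbar = np.array([[0.9, 0.1, 0.5],
                 [0.1, 0.8, 0.5],
                 [0.5, 0.5, 0.5]])      # candidate connectivity matrix
C = Cbar[np.ix_(z, z)]                  # noiseless "graph": C_ij = Cbar[z_i, z_j]
h = np.full(n, 1.0 / n)                 # uniform node weights

# Sparse plan: each node sends all of its mass to its own block.
T_sparse = np.zeros((n, K))
T_sparse[np.arange(n), z] = h

# Dense plan: each node spreads its mass evenly, as strong entropic
# regularization would encourage.
T_dense = np.outer(h, np.full(K, 1.0 / K))

cost_sparse = srgw_objective(C, Cbar, T_sparse)
cost_dense = srgw_objective(C, Cbar, T_dense)
proportions_sparse = T_sparse.sum(axis=0)
proportions_dense = T_dense.sum(axis=0)
```

In this toy case the sparse plan leaves the third candidate cluster with zero mass, so the number of clusters can be read off the column marginal, whereas the dense plan assigns positive mass to every candidate cluster and so gives no such signal.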