A Unified Framework for Community Detection and Model Selection in Blockmodels

📅 2025-05-28

🤖 AI Summary

This paper addresses the coupling between community detection and blockmodel selection (e.g., SBM, DCBM, PABM) in network analysis. The authors propose a unified statistical inference framework built on loss functions that serve both as optimization objectives and as hypothesis-testing criteria, combined with spectral-geometric initialization and a greedy search algorithm, so that the community partition and the model are chosen together. They establish asymptotic guarantees for exact label recovery and consistent model selection under mild regularity conditions. On synthetic benchmarks, the method matches or surpasses state-of-the-art approaches in both community detection accuracy and model selection accuracy; applications to five real-world networks illustrate its interpretability and practical utility. The key innovation is building model selection directly into the loss function design, which yields a single paradigm aimed at both statistical consistency and computational efficiency.

📝 Abstract

Blockmodels are a foundational tool for modeling community structure in networks, with the stochastic blockmodel (SBM), degree-corrected blockmodel (DCBM), and popularity-adjusted blockmodel (PABM) forming a natural hierarchy of increasing generality. While community detection under these models has been extensively studied, much less attention has been paid to the model selection problem, i.e., determining which model best fits a given network. Building on recent theoretical insights about the spectral geometry of these models, we propose a unified framework for simultaneous community detection and model selection across the full blockmodel hierarchy. A key innovation is the use of loss functions that serve a dual role: they act as objective functions for community detection and as test statistics for hypothesis testing. We develop a greedy algorithm to minimize these loss functions and establish theoretical guarantees for exact label recovery and model selection consistency under each model. Extensive simulation studies demonstrate that our method achieves high accuracy in both tasks, outperforming or matching state-of-the-art alternatives. Applications to five real-world networks further illustrate the interpretability and practical utility of our approach.
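
To make the hierarchy concrete, the sketch below builds edge-probability matrices under the standard textbook parameterizations of the three models and checks numerically that SBM is a special case of DCBM, and DCBM a special case of PABM. It is an illustration only, not code from the paper: the function names (`sbm_probs`, `dcbm_probs`, `pabm_probs`) and all parameter values are assumptions chosen for the example.

```python
import numpy as np

def sbm_probs(z, B):
    """SBM: P[i, j] = B[z_i, z_j]."""
    return B[z[:, None], z[None, :]]

def dcbm_probs(z, B, theta):
    """DCBM: P[i, j] = theta_i * theta_j * B[z_i, z_j]."""
    return np.outer(theta, theta) * sbm_probs(z, B)

def pabm_probs(z, Lam):
    """PABM: P[i, j] = Lam[i, z_j] * Lam[j, z_i]."""
    L = Lam[:, z]          # L[i, j] = Lam[i, z_j]
    return L * L.T         # L.T[i, j] = Lam[j, z_i]

# Illustrative (hypothetical) parameters.
rng = np.random.default_rng(0)
n, K = 12, 3
z = rng.integers(0, K, size=n)               # community labels
B = np.full((K, K), 0.1) + 0.4 * np.eye(K)   # symmetric block matrix
theta = rng.uniform(0.5, 1.0, size=n)        # degree parameters

P_sbm = sbm_probs(z, B)
P_dcbm = dcbm_probs(z, B, theta)

# A DCBM written as a PABM: Lam[i, k] = theta_i * sqrt(B[z_i, k]).
Lam = theta[:, None] * np.sqrt(B[z, :])
P_pabm = pabm_probs(z, Lam)

print(np.allclose(dcbm_probs(z, B, np.ones(n)), P_sbm))  # SBM inside DCBM
print(np.allclose(P_dcbm, P_pabm))                       # DCBM inside PABM
```

Because the models are nested, a more general model can always reproduce a simpler one, which is why choosing among them calls for a formal test rather than a comparison of raw fit.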

Problem

Research questions and friction points this paper is trying to address.

Determining the best-fitting blockmodel for a given network

Unifying community detection and model selection in blockmodels

Developing a dual-role loss function that serves as both a detection objective and a test statistic

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified framework for community detection and model selection

Dual-role loss functions for detection and hypothesis testing

Greedy loss minimization with guarantees for exact label recovery and model selection consistency
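
The general idea of greedy loss minimization over community labels can be sketched as follows. This is a generic node-by-node label-switching search, not the paper's algorithm: the loss used here (squared error against block means, `block_loss`) is a simple stand-in for the paper's loss functions, which this summary does not specify, and the corrupted ground-truth labels stand in for whatever initialization the authors use. All names and parameter values are assumptions.

```python
import numpy as np

def block_loss(A, z, K):
    """Stand-in loss (an assumption): squared error between A and block means."""
    Z = np.eye(K)[z]                          # n x K one-hot memberships
    counts = Z.sum(axis=0)                    # community sizes
    sums = Z.T @ A @ Z                        # entry sums per block pair
    pairs = np.outer(counts, counts)          # entries per block pair
    B_hat = np.divide(sums, pairs, out=np.zeros_like(sums), where=pairs > 0)
    return float(((A - B_hat[z[:, None], z[None, :]]) ** 2).sum())

def greedy_refine(A, z_init, K, max_sweeps=20):
    """Move one node at a time to the label that lowers the loss most."""
    z = z_init.copy()
    best = block_loss(A, z, K)
    for _ in range(max_sweeps):
        improved = False
        for i in range(len(z)):
            current = z[i]
            for k in range(K):
                if k == current:
                    continue
                z[i] = k                      # tentatively relabel node i
                loss = block_loss(A, z, K)
                if loss < best - 1e-12:
                    best, current, improved = loss, k, True
            z[i] = current                    # keep the best label found
        if not improved:                      # local minimum reached
            break
    return z, best

# Toy planted-partition network (illustrative parameters).
rng = np.random.default_rng(1)
n, K = 60, 2
z_true = np.repeat(np.arange(K), n // K)
P = np.where(z_true[:, None] == z_true[None, :], 0.6, 0.05)
upper = np.triu(rng.random((n, n)) < P, k=1)
A = (upper | upper.T).astype(float)           # symmetric, zero diagonal

z0 = z_true.copy()                            # corrupted start (assumption)
flip = rng.choice(n, size=10, replace=False)
z0[flip] = 1 - z0[flip]

loss0 = block_loss(A, z0, K)
z_hat, loss1 = greedy_refine(A, z0, K)
print(loss0, loss1)
```

Each accepted move strictly lowers the loss, so the search terminates at a local minimum; the quality of that minimum depends on the starting labels, which is why such searches are usually paired with a good initialization.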

🔎 Similar Papers

Improved Community Detection using Stochastic Block Models