🤖 AI Summary
Detecting overlapping communities in social networks whose topology is unknown or only partially observed remains challenging, especially when only node metadata is available. Method: This paper proposes META-CODE, an exploratory learning framework that jointly infers the network topology and detects overlapping communities. After an initial network inference step, it iterates over three components: community-affiliation embedding with Graph Neural Networks (GNNs) trained by a new reconstruction loss, network exploration via community-affiliation-based node queries, and network inference with an edge connectivity-based Siamese neural network. Contribution/Results: META-CODE unifies metadata-driven community detection and topology inference in a single framework. Evaluated on three real-world datasets, including two large networks, it achieves gains of up to 65.55% in Normalized Mutual Information (NMI) on the Facebook dataset over the best competing method. The experiments further show the impact of each module, the effectiveness of its node queries, and the convergence of the inferred network.
📝 Abstract
In social networks, the discovery of community structures has received considerable attention as a fundamental problem in various network analysis tasks. However, due to privacy concerns or access restrictions, the network structure is often uncertain, thereby rendering established community detection approaches ineffective without costly network topology acquisition. To tackle this challenge, we present META-CODE, a unified framework for detecting overlapping communities via exploratory learning aided by easy-to-collect node metadata when networks are topologically unknown (or only partially known). Specifically, META-CODE consists of three iterative steps in addition to the initial network inference step: 1) node-level community-affiliation embeddings based on graph neural networks (GNNs) trained by our new reconstruction loss, 2) network exploration via community-affiliation-based node queries, and 3) network inference using an edge connectivity-based Siamese neural network model from the explored network. Through extensive experiments on three real-world datasets including two large networks, we demonstrate: (a) the superiority of META-CODE over benchmark community detection methods, achieving gains of up to 65.55% in normalized mutual information (NMI) on the Facebook dataset over the best competing method, (b) the impact of each module in META-CODE, (c) the effectiveness of node queries in META-CODE based on empirical evaluations and theoretical findings, and (d) the convergence of the inferred network.
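The control flow of the initial inference step and the three iterative steps can be sketched as a toy loop. This is only an illustrative sketch under strong assumptions, not the authors' implementation: every function name below is hypothetical, cosine similarity on metadata stands in for the Siamese edge model, an untrained one-layer neighbor aggregation stands in for the trained GNN and its reconstruction loss, and an entropy-based rule stands in for the paper's query strategy, which the abstract does not specify.

```python
import numpy as np

def infer_edges(features, known_adj, queried, threshold=0.9):
    """Network inference (stand-in for the Siamese model, an assumption):
    guess edges from metadata cosine similarity, then overwrite the rows and
    columns of queried nodes with their observed connections."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    adj = (unit @ unit.T >= threshold).astype(float)
    idx = np.array(sorted(queried), dtype=int)
    adj[idx, :] = known_adj[idx, :]
    adj[:, idx] = known_adj[:, idx]
    np.fill_diagonal(adj, 0.0)
    return adj

def embed_affiliations(features, adj, weights):
    """Community-affiliation embedding (stand-in for the trained GNN):
    mean-aggregate features over each node's neighborhood, project to one
    score per community, and clip at zero so affiliations are non-negative."""
    a_hat = adj + np.eye(adj.shape[0])
    a_hat = a_hat / a_hat.sum(axis=1, keepdims=True)
    return np.maximum(a_hat @ features @ weights, 0.0)

def select_query(affiliations, queried):
    """Node query selection (hypothetical rule): pick the unqueried node
    whose affiliation is spread most evenly across communities."""
    totals = np.maximum(affiliations.sum(axis=1, keepdims=True), 1e-12)
    probs = affiliations / totals
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    entropy[np.array(sorted(queried), dtype=int)] = -np.inf
    return int(np.argmax(entropy))

def explore(features, hidden_adj, num_communities, budget, seed=0):
    """Run `budget` rounds of embed -> query -> infer on a hidden graph."""
    rng = np.random.default_rng(seed)
    n, d = features.shape
    weights = rng.normal(size=(d, num_communities))  # untrained projection
    known = np.zeros((n, n))
    queried = []
    adj = infer_edges(features, known, queried)      # initial inference step
    for _ in range(budget):
        aff = embed_affiliations(features, adj, weights)   # step 1
        node = select_query(aff, queried)                  # step 2
        known[node, :] = hidden_adj[node, :]   # the query reveals the
        known[:, node] = hidden_adj[:, node]   # node's true neighbors
        queried.append(node)
        adj = infer_edges(features, known, queried)        # step 3
    return embed_affiliations(features, adj, weights), queried, adj

# Toy usage: two triangles joined by one bridge edge, random metadata.
hidden = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    hidden[i, j] = hidden[j, i] = 1.0
metadata = np.random.default_rng(1).normal(size=(6, 4))
affiliations, order, inferred = explore(metadata, hidden, 2, budget=3)
print("query order:", order)
```

In this toy setting, each query replaces guessed edges with observed ones, so the inferred adjacency matches the hidden graph exactly once every node has been queried. That is a property of the sketch itself and says nothing about the convergence behavior reported in the paper.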