🤖 AI Summary
Existing overlapping community search (OCS) methods return non-personalized outputs, and existing machine learning (ML) models for community search are inefficient to train. To address these limitations, this paper presents what the authors describe as the first ML-based study of OCS and proposes a personalized OCS framework. First, it introduces the Sparse Subspace Filter (SSF), which extends ML-based community search models to support user-aware, personalized search in overlapping structures. Second, it introduces Simplified Multi-hop Attention Networks (SMN), a lightweight model that combines graph neural networks with sparse subspace embeddings, achieving a large receptive field while drastically improving training efficiency. Experimental results show that SMN within the SSF pipeline achieves a 13.73% improvement in F1-score over state-of-the-art baselines and up to three orders of magnitude acceleration in model efficiency. Together, these results improve both accuracy and scalability for personalized overlapping community search.
📝 Abstract
Overlapping Community Search (OCS) identifies nodes that interact with multiple communities based on a specified query. Existing community search (CS) approaches fall into two categories: algorithm-based models and Machine Learning (ML)-based models. Despite the long-standing focus on this topic within the database domain, current solutions face two major limitations: 1) Both categories fail to address personalized user requirements in OCS, consistently returning the same set of nodes for a given query regardless of user differences. 2) Existing ML-based CS models suffer from severe training efficiency issues. In this paper, we formally redefine the problem of OCS. By analyzing the gaps in both types of approaches, we then propose a general solution for OCS named Sparse Subspace Filter (SSF), which can extend any ML-based CS model to enable personalized search in overlapping structures. To overcome the efficiency issue in current models, we introduce Simplified Multi-hop Attention Networks (SMN), a lightweight yet effective community search model with larger receptive fields. To the best of our knowledge, this is the first ML-based study of overlapping community search. Extensive experiments validate the superior performance of SMN within the SSF pipeline, which achieves a 13.73% improvement in F1-score and up to 3 orders of magnitude acceleration in model efficiency compared to state-of-the-art approaches.
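For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F1-score is commonly computed for a single community search result, treating the returned nodes and the ground-truth community as sets. The function name `community_f1` and the example node IDs are hypothetical, and the abstract does not state how the paper averages scores across queries, so this illustrates only the standard definition, not the authors' evaluation protocol.

```python
def community_f1(predicted, ground_truth):
    """Standard set-based F1 between a returned community and the true one.

    Hypothetical helper for illustration; not taken from the paper.
    """
    predicted, ground_truth = set(predicted), set(ground_truth)
    true_positives = len(predicted & ground_truth)
    if true_positives == 0:
        # No overlap (or an empty set): precision and recall are both zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(ground_truth)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Made-up node IDs: 3 of the 4 returned nodes are correct,
# and 3 of the 5 true members are recovered.
found = {1, 2, 3, 4}
truth = {2, 3, 4, 5, 6}
score = community_f1(found, truth)
print(f"F1 = {score:.4f}")
```

Here precision is 3/4 and recall is 3/5, so the score rewards a result only when it is both accurate and complete, which is why it is a common choice for comparing community search models.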