🤖 AI Summary
Multi-view attributed graph (MVAG) clustering and embedding suffer from low-quality multi-view fusion and high computational overhead. Method: We propose spectrum-guided Laplacian aggregation, realized by two algorithms, SGLA and SGLA+, which integrates all graph views and attribute views into a single MVAG Laplacian matrix. The integrated objective combines an eigengap term and a connectivity term, linking the spectral properties of the aggregated Laplacian to the community and connectivity structure of the graph. Because the resulting constrained optimization problem is expensive to solve, SGLA+ uses sampling and approximation to reduce the number of costly objective evaluations. Contribution/Results: On eight MVAGs, SGLA and SGLA+ outperform 12 clustering baselines and 8 embedding baselines in result quality and efficiency, and they run significantly faster than the most effective baselines, in many cases by orders of magnitude. Target applications include recommendation systems, anomaly detection, and social network analysis.
📝 Abstract
A multi-view attributed graph (MVAG) G captures the diverse relationships and properties of real-world entities through multiple graph views and attribute views. Effectively utilizing all views in G is essential for MVAG clustering and embedding, which are important for applications such as recommendation systems, anomaly detection, and social network analysis. Existing methods either produce inferior results or incur significant computational costs when handling large-scale MVAGs.
In this paper, we present a spectrum-guided Laplacian aggregation scheme, with an effective objective formulation and two efficient algorithms, SGLA and SGLA+, that cohesively integrates all views of G into an MVAG Laplacian matrix. This matrix readily enables classic graph algorithms to handle G with superior performance in clustering and embedding tasks. We begin with a theoretical analysis to design an integrated objective consisting of two components, the eigengap objective and the connectivity objective, which link the spectral properties of the aggregated MVAG Laplacian to the underlying community and connectivity properties of G. We then formulate the integration as a constrained optimization problem, which is computationally expensive to solve. We therefore first develop the SGLA algorithm, which already achieves excellent performance compared with existing methods. To further improve efficiency, we design SGLA+, which uses sampling and approximation to reduce the number of costly objective evaluations and quickly find an approximate optimum. Extensive experiments compare our methods against 12 baselines for clustering and 8 baselines for embedding on 8 multi-view attributed graphs, validating the superior performance of SGLA and SGLA+ in terms of result quality and efficiency. Compared with the most effective baselines, our methods are significantly faster, often by orders of magnitude.
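To make the aggregation idea concrete, the following is a minimal sketch rather than the authors' SGLA or SGLA+ implementation. The abstract does not give the exact objective or constraints, so several choices here are assumptions: every view is taken to be already available as a symmetric adjacency matrix, the view weights are constrained to be non-negative and sum to 1, the eigengap is measured as the difference between the (k+1)-th and k-th smallest eigenvalues, connectivity is measured by the second-smallest eigenvalue, and the two terms are simply added. The function names and the toy graphs are hypothetical, and the grid search stands in for the paper's optimization procedure.

```python
import numpy as np


def normalized_laplacian(adj):
    """Return I - D^{-1/2} A D^{-1/2} for a dense, symmetric adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(deg[nonzero])
    return np.eye(adj.shape[0]) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]


def aggregate_laplacians(laplacians, weights):
    """Weighted sum of view Laplacians (assumed constraint: weights on the simplex)."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * lap for w, lap in zip(weights, laplacians))


def spectral_scores(lap, k):
    """Return (eigengap, connectivity) of a symmetric Laplacian for k clusters."""
    eigvals = np.linalg.eigvalsh(lap)       # eigenvalues in ascending order
    eigengap = eigvals[k] - eigvals[k - 1]  # (k+1)-th minus k-th smallest
    connectivity = eigvals[1]               # second-smallest eigenvalue
    return eigengap, connectivity


def toy_adjacency(edges, n=6):
    """Build a small undirected graph from an edge list (illustrative data only)."""
    adj = np.zeros((n, n))
    for a, b in edges:
        adj[a, b] = adj[b, a] = 1.0
    return adj


TRIANGLES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
view_a = toy_adjacency(TRIANGLES + [(2, 3)])                    # two bridged triangles
view_b = toy_adjacency([(i, (i + 1) % 6) for i in range(6)])    # a 6-node ring
laps = [normalized_laplacian(view_a), normalized_laplacian(view_b)]


def score(w):
    """Assumed objective: eigengap plus connectivity of the aggregated Laplacian."""
    gap, conn = spectral_scores(aggregate_laplacians(laps, [w, 1.0 - w]), k=2)
    return gap + conn


# Coarse grid search over the weight of view A; a stand-in for the real optimizer.
best_w = max(np.linspace(0.0, 1.0, 11), key=score)
print(f"best weight for view A: {best_w:.1f}, score: {score(best_w):.3f}")
```

Each call to `score` requires an eigendecomposition of the aggregated Laplacian, which illustrates why objective evaluations are the costly step that SGLA+ is described as reducing through sampling and approximation.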