Effective Clustering for Large Multi-Relational Graphs

📅 2025-08-24
🤖 AI Summary
To address poor fusion of heterogeneous structures and attributes and limited scalability in multi-relational graph (MRG) node clustering (MRGC), this paper proposes DEMM and its more scalable variant DEMM+. The methods introduce a multi-relational Dirichlet energy minimization framework to generate robust node representations, followed by a theoretically grounded problem transformation and sparsification strategy that enables two-stage clustering in linear time without explicitly constructing dense similarity matrices. DEMM+ also extends to attribute-free graphs. Extensive experiments on 11 real-world MRGs, ranging up to one million nodes and one billion edges, compare against 20 baselines and show consistently higher clustering quality (e.g., NMI, ARI), often with markedly shorter running time.

📝 Abstract
Multi-relational graphs (MRGs) are an expressive data structure for modeling diverse interactions/relations among real objects (i.e., nodes), which pervade extensive applications and scenarios. Given an MRG G with N nodes, partitioning the node set therein into K disjoint clusters (MRGC) is a fundamental task in analyzing MRGs, which has garnered considerable attention. However, the majority of existing solutions towards MRGC either yield severely compromised result quality by ineffective fusion of heterogeneous graph structures and attributes, or struggle to cope with sizable MRGs with millions of nodes and billions of edges due to the adoption of sophisticated and costly deep learning models. In this paper, we present DEMM and DEMM+, two effective MRGC approaches to address the limitations above. Specifically, our algorithms are built on novel two-stage optimization objectives, where the former seeks to derive high-caliber node feature vectors by optimizing the multi-relational Dirichlet energy specialized for MRGs, while the latter minimizes the Dirichlet energy of clustering results over the node affinity graph. In particular, DEMM+ achieves significantly higher scalability and efficiency than our base method DEMM through a suite of well-thought-out optimizations. Key technical contributions include (i) a highly efficient approximation solver for constructing node feature vectors, and (ii) a theoretically grounded problem transformation with carefully crafted techniques that enable linear-time clustering without explicitly materializing the N × N dense affinity matrix. Further, we extend DEMM+ to handle attribute-less MRGs through non-trivial adaptations. Extensive experiments, comparing DEMM+ against 20 baselines over 11 real MRGs, show that DEMM+ is consistently superior in terms of clustering quality measured against ground-truth labels, while often being remarkably faster.
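
The abstract relies on the notion of Dirichlet energy without defining it. As a rough illustration only, the sketch below computes the standard Dirichlet energy tr(Hᵀ(D − A)H) of node features H on a single graph, and then combines per-relation energies with a weighted sum. The weighted-sum form, the use of the unnormalized Laplacian, and the function names are assumptions made for this example; the paper's specialized multi-relational energy may be defined differently.

```python
import numpy as np

def dirichlet_energy(adj, feats):
    """Standard Dirichlet energy trace(H^T (D - A) H) on one undirected graph.

    For a symmetric adjacency matrix this equals the sum over edges of the
    squared distance between the feature vectors of the two endpoints.
    """
    deg = adj.sum(axis=1)
    lap = np.diag(deg) - adj  # unnormalized graph Laplacian
    return float(np.trace(feats.T @ lap @ feats))

def multi_relational_energy(adjs, weights, feats):
    """ASSUMPTION: a weighted sum of per-relation energies (illustrative only)."""
    return sum(w * dirichlet_energy(a, feats) for a, w in zip(adjs, weights))

# Toy MRG with 3 nodes and 2 relation types.
rel_a = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # path 0-1-2
rel_b = np.array([[0, 0, 1], [0, 0, 0], [1, 0, 0]], dtype=float)  # edge 0-2
feats = np.array([[1.0], [1.0], [0.0]])  # one feature per node

energy = multi_relational_energy([rel_a, rel_b], [0.5, 0.5], feats)
constant_energy = multi_relational_energy(
    [rel_a, rel_b], [0.5, 0.5], np.ones((3, 1))
)
```

Features that vary little across connected nodes yield low energy, and constant features yield zero, which is why minimizing such an energy (under suitable constraints) tends to produce smooth, cluster-friendly representations.
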
Problem

Research questions and friction points this paper is trying to address.

Clustering large multi-relational graphs with millions of nodes
Overcoming ineffective fusion of heterogeneous graph structures and attributes
Addressing scalability issues with costly deep learning models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage optimization objectives for clustering
Efficient approximation solver for feature vectors
Linear-time clustering without dense affinity matrix
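
To give a feel for how clustering can avoid a dense N × N affinity matrix, the sketch below uses a generic and well-known trick rather than the paper's actual transformation. It assumes, purely for illustration, that the affinity matrix has the low-rank form Z Zᵀ for an N × d feature matrix Z; a product (Z Zᵀ)v can then be evaluated as Z(Zᵀv) in O(Nd) time and memory. The function name and the inner-product affinity are hypothetical choices for this example.

```python
import numpy as np

def affinity_matvec(feats, vec):
    """Compute (Z Z^T) v as Z (Z^T v) without forming the N x N matrix Z Z^T."""
    return feats @ (feats.T @ vec)

rng = np.random.default_rng(0)
feats = rng.standard_normal((1000, 8))  # N = 1000 nodes, d = 8 features
vec = rng.standard_normal(1000)

implicit = affinity_matvec(feats, vec)

# Dense reference, feasible only because N is small in this toy example.
dense = (feats @ feats.T) @ vec
```

Iterative solvers that only need matrix-vector products, such as power iteration or Lanczos-style eigensolvers, can be driven by a routine like this, so the cost stays linear in N for a fixed feature dimension.
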
Xiaoyang Lin
Hong Kong Baptist University, Hong Kong SAR, China
Runhao Jiang
Zhejiang University
Neuromorphic Computing, Spiking Neuron Network, Deep learning
Renchi Yang
Hong Kong Baptist University, Hong Kong SAR, China