Distributed Gradient Clustering: Convergence and the Effect of Initialization

📅 2026-03-20
🤖 AI Summary
This study addresses global clustering in distributed networks where each user holds local data and can communicate only with its neighbors. The authors evaluate how centroid initialization affects a family of distributed gradient-based clustering algorithms and propose a distributed centroid initialization method inspired by K-means++. Numerical experiments indicate that the distributed methods are more resilient to the choice of initial centroids than centralized gradient clustering, and that the proposed initialization improves clustering performance over random initialization.

📝 Abstract
We study the effects of center initialization on the performance of a family of distributed gradient-based clustering algorithms introduced in [1], which work over connected networks of users. In the considered scenario, each user holds a local dataset and communicates only with its immediate neighbours, with the aim of finding a global clustering of the joint data. Through extensive numerical experiments, we evaluate how center initialization affects this family of methods and demonstrate that they are more resilient to initialization than centralized gradient clustering [2]. Next, inspired by the $K$-means++ initialization [3], we propose a novel distributed center initialization scheme, which is shown to improve the performance of our methods compared to the baseline random initialization.
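
For readers unfamiliar with the K-means++ initialization [3] that the abstract refers to, the sketch below shows the standard centralized seeding rule: the first center is drawn uniformly from the data, and each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far. This is not the distributed scheme proposed in the paper, whose details are not given on this page. The function name `kmeans_pp_init` and its parameters are illustrative assumptions.

```python
import numpy as np


def kmeans_pp_init(data, k, rng):
    """Standard centralized K-means++ seeding (illustrative only).

    NOTE: this is NOT the paper's distributed initialization; it is the
    classical single-machine rule that the paper takes as inspiration.
    data: array of shape (n, d); k: number of centers; rng: numpy Generator.
    """
    n = data.shape[0]
    # First center: a data point chosen uniformly at random.
    first = rng.integers(n)
    centers = [data[first]]
    # Squared distance from every point to its nearest chosen center.
    d2 = np.sum((data - data[first]) ** 2, axis=1)
    for _ in range(1, k):
        total = d2.sum()
        # If all points coincide with chosen centers, fall back to uniform.
        probs = d2 / total if total > 0 else None
        idx = rng.choice(n, p=probs)
        centers.append(data[idx])
        # Update nearest-center distances with the newly added center.
        d2 = np.minimum(d2, np.sum((data - data[idx]) ** 2, axis=1))
    return np.stack(centers)
```

A call such as `kmeans_pp_init(points, 3, np.random.default_rng(0))` returns a `(3, d)` array of initial centers taken from the rows of `points`. In a distributed setting no single user sees all of `data`, which is the gap a distributed variant has to close.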
Problem

Research questions and friction points this paper is trying to address.

distributed clustering
gradient-based clustering
center initialization
networked data
K-means++
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distributed clustering
Gradient-based clustering
Center initialization
K-means++
Networked systems
Authors
Aleksandar Armacki
Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
Himkant Sharma
Indian Institute of Technology Kharagpur, Kharagpur, India
Dragana Bajović
Faculty of Technical Sciences, University of Novi Sad, Novi Sad, Serbia
Dušan Jakovetić
Faculty of Sciences, University of Novi Sad, Novi Sad, Serbia
Mrityunjoy Chakraborty
Indian Institute of Technology Kharagpur, Kharagpur, India
Soummya Kar
Electrical and Computer Engineering, Carnegie Mellon University
Large Scale Stochastic Systems