Unifying Proportional Fairness in Centroid and Non-Centroid Clustering

📅 2026-01-01

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

This work gives a unified treatment of proportional fairness in centroid and non-centroid clustering. It introduces semi-centroid clustering, in which each data point's loss combines its centroid loss (distance to its cluster's representative point) and its non-centroid loss (maximum distance to any other point in its cluster), and studies two fairness criteria: the core and its relaxation, fully justified representation (FJR). The main result is a polynomial-time algorithm that achieves a constant approximation to the core, even when the centroid and non-centroid losses are measured with different distance metrics. The authors also derive improved results for more restricted loss functions and for the weaker FJR criterion, and establish lower bounds in each case.

📝 Abstract

Proportional fairness criteria inspired by democratic ideals of proportional representation have received growing attention in the clustering literature. Prior work has investigated them in two separate paradigms. Chen et al. [ICML 2019] study centroid clustering, in which each data point's loss is determined by its distance to a representative point (centroid) chosen in its cluster. Caragiannis et al. [NeurIPS 2024] study non-centroid clustering, in which each data point's loss is determined by its maximum distance to any other data point in its cluster. We generalize both paradigms to introduce semi-centroid clustering, in which each data point's loss is a combination of its centroid and non-centroid losses, and study two proportional fairness criteria: the core and its relaxation, fully justified representation (FJR). Our main result is a novel algorithm that achieves a constant approximation to the core, in polynomial time, even when the distance metrics used for centroid and non-centroid loss measurements are different. We also derive improved results for more restricted loss functions and the weaker FJR criterion, and establish lower bounds in each case.
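
To make the three loss notions in the abstract concrete, here is a minimal Python sketch. It is illustrative only and not the paper's algorithm. The function names, the `weight` parameter, and the choice of a weighted sum as the "combination" are assumptions; the abstract does not specify how the two losses are combined. The two metric arguments reflect the abstract's remark that the centroid and non-centroid distances may differ.

```python
from math import dist


def centroid_loss(point, centroid, metric=dist):
    """Centroid loss: distance from the point to its cluster's representative."""
    return metric(point, centroid)


def non_centroid_loss(point, cluster, metric=dist):
    """Non-centroid loss: maximum distance to any point in the same cluster.

    Including the point itself is harmless, since its distance to itself is 0.
    """
    return max((metric(point, other) for other in cluster), default=0.0)


def semi_centroid_loss(point, cluster, centroid,
                       centroid_metric=dist, pairwise_metric=dist, weight=1.0):
    """Hypothetical combination: centroid loss plus weight * non-centroid loss.

    The weighted sum is an assumption made for illustration only.
    """
    return (centroid_loss(point, centroid, centroid_metric)
            + weight * non_centroid_loss(point, cluster, pairwise_metric))


# Toy cluster in the plane, with the centroid chosen at the origin.
cluster = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]
centroid = (0.0, 0.0)

for p in cluster:
    print(p, semi_centroid_loss(p, cluster, centroid))
```

Setting `weight=0.0` recovers the pure centroid loss, while replacing `centroid_metric` with a function that always returns 0 recovers the pure non-centroid loss, which is the sense in which the combined loss generalizes both paradigms.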

Problem

Research questions and friction points this paper is trying to address.

proportional fairness

centroid clustering

non-centroid clustering

core

fully justified representation

Innovation

Methods, ideas, or system contributions that make the work stand out.

semi-centroid clustering

proportional fairness

core