Fair Clustering with Minimum Representation Constraints

🤖 AI Summary
This work studies fair $k$-means and $k$-medians clustering under *minimum representation constraints*: each sensitive group must make up at least a specified proportion (e.g., 50%) of at least a prescribed number of clusters, a structural fairness requirement that arises in applications such as electoral districting. The paper formulates the problem as a mixed-integer nonlinear program (MINLP) and proposes MiniReL, an alternating minimization algorithm for solving it. The fairness constraints make the assignment step inside MiniReL NP-hard, so the paper presents several heuristic strategies that keep the method practical on large datasets. Experiments on standard benchmark datasets indicate that MiniReL produces clusterings that satisfy the representation constraints without increasing clustering cost.
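To make the constraint concrete, the following is a minimal sketch of how one might check whether a given clustering satisfies minimum representation requirements. It is an illustration only, not code from the paper: the function names and the parameters `alpha` (the required within-cluster share) and `beta` (the required number of clusters per group) are assumptions introduced here.

```python
from collections import Counter


def clusters_meeting_threshold(labels, groups, alpha):
    """Count, for each group, the clusters in which its share is at least alpha.

    labels[i] is the cluster of point i, groups[i] is its sensitive group.
    `alpha` is a hypothetical name for the minimum share (e.g. 0.5).
    """
    cluster_sizes = Counter(labels)
    pair_counts = Counter(zip(labels, groups))
    met = {group: 0 for group in set(groups)}
    for (cluster, group), count in pair_counts.items():
        if count >= alpha * cluster_sizes[cluster]:
            met[group] += 1
    return met


def satisfies_min_representation(labels, groups, alpha, beta):
    """True if every group g reaches share alpha in at least beta[g] clusters.

    `beta` is a hypothetical dict mapping each group to its required count.
    """
    met = clusters_meeting_threshold(labels, groups, alpha)
    return all(met.get(group, 0) >= need for group, need in beta.items())


# Toy example: two clusters of three points each, two groups.
labels = [0, 0, 0, 1, 1, 1]
groups = ["a", "a", "b", "b", "b", "a"]
counts = clusters_meeting_threshold(labels, groups, alpha=0.5)
feasible = satisfies_min_representation(labels, groups, 0.5, {"a": 1, "b": 1})
```

In the toy example each group holds two of three seats in exactly one cluster, so a requirement of one cluster per group is met, while a requirement of two clusters for group `"a"` would not be.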

📝 Abstract
Clustering is a well-studied unsupervised learning task that aims to partition data points into a number of clusters. In many applications, these clusters correspond to real-world constructs (e.g., electoral districts, playlists, TV channels), where a group (e.g., social or demographic) benefits only if it reaches a minimum level of representation in the cluster (e.g., 50% to elect their preferred candidate). In this paper, we study the k-means and k-medians clustering problems under the additional fairness constraint that each group must attain a minimum level of representation in at least a specified number of clusters. We formulate this problem as a mixed-integer (nonlinear) optimization problem and propose an alternating minimization algorithm, called MiniReL, to solve it. Although incorporating fairness constraints results in an NP-hard assignment problem within the MiniReL algorithm, we present several heuristic strategies that make the approach practical even for large datasets. Numerical results demonstrate that our method yields fair clusters without increasing clustering cost across standard benchmark datasets.
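One way the mixed-integer formulation mentioned above could be written is sketched below. This is an assumed reconstruction for illustration, not the paper's exact model: the symbols $x_i$ (data points), $c_k$ (cluster centers), $z_{ik}$ (assignment variables), $y_{gk}$ (indicators that group $g$ reaches the threshold in cluster $k$), $\alpha$ (minimum share), $\beta_g$ (required number of clusters), $X_g$ (points in group $g$), and the big-$M$ constant are all notation introduced here. $D$ denotes squared Euclidean distance for $k$-means and $\ell_1$ distance for $k$-medians.

```latex
\begin{align*}
\min_{z,\,y,\,c} \quad & \sum_{i=1}^{n} \sum_{k=1}^{K} z_{ik}\, D(x_i, c_k) \\
\text{s.t.} \quad
  & \sum_{k=1}^{K} z_{ik} = 1
      && \forall i
      && \text{(each point joins one cluster)} \\
  & \sum_{i \in X_g} z_{ik} + M\,(1 - y_{gk}) \;\ge\; \alpha \sum_{i=1}^{n} z_{ik}
      && \forall g,\ k
      && \text{(share is enforced only when } y_{gk} = 1\text{)} \\
  & \sum_{k=1}^{K} y_{gk} \;\ge\; \beta_g
      && \forall g
      && \text{(enough clusters per group)} \\
  & z_{ik} \in \{0,1\}, \quad y_{gk} \in \{0,1\}
\end{align*}
```

Under these assumptions, any $M \ge n$ makes the second constraint vacuous when $y_{gk} = 0$. The objective is nonlinear because the binary $z_{ik}$ multiplies a function of the continuous centers $c_k$, which is what motivates alternating between the assignment variables and the centers.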

Problem

Research questions and friction points this paper is trying to address.

Ensures minimum group representation in clustering outcomes
Addresses fairness in k-means and k-medians clustering algorithms
Solves NP-hard assignment with practical heuristics for large datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Minimum representation constraints for fair clustering
Mixed-integer optimization with alternating minimization
Heuristic strategies for NP-hard assignment problem
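
The alternating minimization idea listed above can be sketched as a Lloyd-style loop with a pluggable assignment step. This is a generic skeleton under stated assumptions, not MiniReL itself: the default `nearest_center_assignment` ignores fairness entirely and stands in for the constrained (NP-hard) assignment step, whose heuristics are not described in this summary. All function names here are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_center_assignment(points, centers):
    """Unconstrained placeholder for the assignment step.

    A fair variant would replace this with a routine that also enforces
    the minimum representation constraints.
    """
    return [
        min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
        for p in points
    ]


def update_centers(points, labels, centers):
    """k-means center update: mean of each cluster's members.

    An empty cluster keeps its previous center.
    """
    new_centers = []
    for k, old_center in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == k]
        if members:
            mean = tuple(sum(coord) / len(members) for coord in zip(*members))
            new_centers.append(mean)
        else:
            new_centers.append(old_center)
    return new_centers


def alternating_minimization(points, centers,
                             assign=nearest_center_assignment, max_iter=100):
    """Alternate between assigning points and updating centers."""
    labels = assign(points, centers)
    for _ in range(max_iter):
        centers = update_centers(points, labels, centers)
        new_labels = assign(points, centers)
        if new_labels == labels:  # assignments are stable, so stop
            break
        labels = new_labels
    cost = sum(squared_distance(p, centers[label])
               for p, label in zip(points, labels))
    return labels, centers, cost


# Toy example: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels, centers, cost = alternating_minimization(
    points, [(0.0, 0.0), (10.0, 0.0)]
)
```

Keeping the assignment step as a parameter mirrors the structure described in the abstract: the center update stays the same, and only the assignment routine would need to change to account for group representation.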

Connor Lawless
Postdoc, Stanford University
Oktay Gunluk
Industrial and Systems Engineering, Georgia Institute of Technology