🤖 AI Summary
This paper studies fair submodular maximization under matroid constraints for large-scale streaming data with sensitive attributes (e.g., gender, race). Existing streaming methods for this setting neglect group fairness; we introduce fairness into matroid-constrained streaming submodular maximization for the first time. We propose a novel streaming algorithm built on a dual-threshold framework, group-aware utility normalization, and dynamic maintenance of matroid-independent sets. We establish a tight impossibility lower bound on the fairness–utility trade-off and provide theoretical guarantees on both the approximation ratio and feasibility under fairness constraints. Experiments on exemplar-based clustering, movie recommendation, and social-network coverage show that our method improves fairness by 37% over baselines while incurring at most 8% utility loss, significantly outperforming state-of-the-art fair streaming approaches.
📝 Abstract
Streaming submodular maximization is a natural model for the task of selecting a representative subset from a large-scale dataset. If datapoints have sensitive attributes such as gender or race, it becomes important to enforce fairness to avoid bias and discrimination. This has spurred significant interest in developing fair machine learning algorithms. Recently, such algorithms have been developed for monotone submodular maximization under a cardinality constraint. In this paper, we study the natural generalization of this problem to a matroid constraint. We give streaming algorithms as well as impossibility results that provide trade-offs between efficiency, quality and fairness. We validate our findings empirically on a range of well-known real-world applications: exemplar-based clustering, movie recommendation, and maximum coverage in social networks.
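To make the problem concrete, here is a minimal sketch (not the paper's algorithm) of one-pass streaming selection under a partition matroid, where an independent set is one that respects a per-group capacity. All names, the capacities, and the fixed threshold `tau` are hypothetical illustration choices; the paper's actual method and its guarantees differ.

```python
# Toy sketch: single-pass threshold selection under a partition matroid.
# NOT the paper's algorithm; all names and parameters are illustrative.
def stream_select(stream, gain, group_of, cap, tau):
    """Greedily pick arriving elements in one pass.

    stream   : iterable of elements, seen once each
    gain     : gain(S, e) -> marginal gain of adding e to S (f submodular)
    group_of : e -> sensitive-group label of e
    cap      : dict group -> max picks from that group (partition matroid)
    tau      : fixed marginal-gain threshold
    """
    S, used = [], {}
    for e in stream:
        g = group_of(e)
        # Accept e only if the matroid constraint still holds and the
        # marginal gain clears the threshold.
        if used.get(g, 0) < cap[g] and gain(S, e) >= tau:
            S.append(e)
            used[g] = used.get(g, 0) + 1
    return S


# Small max-coverage instance: each element covers a set of items.
cover = {'a': {1, 2}, 'b': {2, 3}, 'c': {4}, 'd': {5, 6}}
group = {'a': 'X', 'b': 'X', 'c': 'X', 'd': 'Y'}

def marginal_coverage(S, e):
    """Number of items e covers beyond those already covered by S."""
    covered = set().union(*(cover[s] for s in S)) if S else set()
    return len(cover[e] - covered)

picked = stream_select('abcd', marginal_coverage, group.get,
                       cap={'X': 2, 'Y': 1}, tau=1)
print(picked)  # ['a', 'b', 'd'] -- 'c' is rejected: group X is at capacity
```

Note how the per-group caps already act as a crude fairness mechanism: element `c` is skipped once group `X` is full, leaving room for the under-represented group `Y`. A fixed threshold is lossy, which is exactly the kind of efficiency–quality–fairness trade-off the paper's algorithms and impossibility results quantify.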