Fair-Count-Min: Frequency Estimation under Equal Group-wise Approximation Factor

📅 2025-05-25
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Count-Min (CM) sketches are unfair across groups in streaming frequency estimation: their additive error weighs far more heavily on low-frequency elements, so groups dominated by such elements see a higher expected approximation factor than groups of high-frequency elements. To address this, the paper proposes Fair-Count-Min, built on Group-Aware Semi-Uniform Hashing (GASH), a framework that combines group-aware semi-uniform hash functions with column partitioning to eliminate collisions between elements of different groups. This yields a theoretical guarantee of equal expected approximation factors across all groups. The paper also formally analyzes the accuracy trade-off induced by enforcing fairness (the price of fairness). Experiments on real-world and synthetic datasets show that the approach achieves inter-group fairness with minimal additional estimation error while remaining competitive with standard CM sketches in efficiency.

📝 Abstract
Frequency estimation in streaming data often relies on sketches like Count-Min (CM) to provide approximate answers with sublinear space. However, CM sketches introduce additive errors that disproportionately impact low-frequency elements, creating fairness concerns across different groups of elements. We introduce Fair-Count-Min, a frequency estimation sketch that guarantees equal expected approximation factors across element groups, thus addressing the unfairness issue. We propose a column partitioning approach with group-aware semi-uniform hashing to eliminate collisions between elements from different groups. We provide theoretical guarantees for fairness, analyze the price of fairness, and validate our theoretical findings through extensive experiments on real-world and synthetic datasets. Our experimental results show that Fair-Count-Min achieves fairness with minimal additional error and maintains competitive efficiency compared to standard CM sketches.

Problem

Research questions and friction points this paper is trying to address.

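CM's additive error disproportionately affects low-frequency elements, so approximation factors differ across element groups
Collisions between elements of different groups let one group's frequencies inflate another group's estimates
Enforcing fairness should cost little in additional error or efficiency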
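
A short worked illustration of why one additive error bound is unequal in relative terms. It uses the standard Count-Min guarantee for insert-only streams, which this page does not restate. The symbols $w$ (columns), $d$ (rows), $\varepsilon$, $\delta$, and $N$ (total stream count) are the usual CM parameters and are assumptions here. The paper's own definition of the approximation factor may differ in detail.

```latex
% Standard Count-Min guarantee with w = \lceil e/\varepsilon \rceil columns
% and d = \lceil \ln(1/\delta) \rceil rows:
\[
  f(x) \;\le\; \hat{f}(x),
  \qquad
  \Pr\!\left[\hat{f}(x) \le f(x) + \varepsilon N\right] \;\ge\; 1 - \delta .
\]
% Dividing by the true frequency gives the implied bound on the ratio:
\[
  \frac{\hat{f}(x)}{f(x)} \;\le\; 1 + \frac{\varepsilon N}{f(x)}
  \qquad \text{with probability at least } 1 - \delta .
\]
% Example with \varepsilon N = 100:
%   f(x) = 10000  gives a ratio of at most 1 + 100/10000 = 1.01
%   f(x) = 10     gives a ratio of at most 1 + 100/10    = 11
```

The same additive slack therefore permits a far larger relative overestimate for a rare element than for a frequent one, which is the disparity the group-wise fairness requirement targets.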

Innovation

Methods, ideas, or system contributions that make the work stand out.

Equal expected approximation factors across groups
Column partitioning with group-aware hashing
Fairness guarantees with minimal error increase
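
The column-partitioning idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the class name `PartitionedCountMin`, the `group_widths` parameter, and the salted BLAKE2b hashing are hypothetical choices, and the rule the paper uses to decide how many columns each group receives is not given on this page, so widths are supplied by the caller. Each group owns a disjoint range of columns, and an element is hashed only within its own group's range, so elements of different groups can never collide.

```python
import hashlib


class PartitionedCountMin:
    """Count-Min variant in which each group owns a disjoint column range.

    Illustrative sketch only; the width allocation is an assumption.
    """

    def __init__(self, depth, group_widths):
        # group_widths: dict mapping group id -> number of columns reserved
        # for that group (hypothetical parameter, chosen by the caller).
        self.depth = depth
        self.ranges = {}
        start = 0
        for group, width in group_widths.items():
            if width <= 0:
                raise ValueError("each group needs at least one column")
            self.ranges[group] = (start, width)
            start += width
        self.width = start
        self.table = [[0] * self.width for _ in range(depth)]

    def _column(self, row, item, group):
        # Hash the item with a per-row salt, then map it into the
        # column range owned by the item's group.
        start, width = self.ranges[group]
        digest = hashlib.blake2b(
            repr(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return start + int.from_bytes(digest, "little") % width

    def add(self, item, group, count=1):
        for row in range(self.depth):
            self.table[row][self._column(row, item, group)] += count

    def estimate(self, item, group):
        # As in standard Count-Min, take the minimum over rows.
        return min(
            self.table[row][self._column(row, item, group)]
            for row in range(self.depth)
        )


sketch = PartitionedCountMin(depth=4, group_widths={"A": 64, "B": 64})
for _ in range(1000):
    sketch.add("popular", "A")
sketch.add("rare", "B", count=3)

# "rare" is the only element in group B, and group A cannot touch
# B's columns, so its estimate is exact.
print(sketch.estimate("rare", "B"))  # 3
```

In a plain Count-Min sketch of the same total width, `"rare"` could share a cell with `"popular"` in some rows. Here the heavy element's count stays confined to group A's columns, at the cost of each group hashing into fewer columns than the full width.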