🤖 AI Summary
This paper addresses fairness risks in learning-to-rank systems, such as "fairwashing", that arise from query polarity disparities, and proposes DistFaiR, a distribution-level fairness metric. Methodologically, it introduces the first amortized objective that aligns each individual's attention and relevance distributions over a sequence of queries, using Jensen–Shannon (JS) and total variation (TV) divergences to quantify the distributional discrepancy, together with query-aware polarity modeling and integer linear programming for verifiable dynamic fair ranking. Theoretically, it establishes an upper bound on group fairness in terms of individual distributional fairness, making fairness guarantees more reliable. Empirically, DistFaiR achieves a significant fairness improvement (+12.7% AvgFair) across multiple benchmark datasets while preserving ranking utility (NDCG@10 degradation below 0.5%), and optimizing individual fairness often improves group fairness as well.
📝 Abstract
Machine learning-driven rankings, where individuals (or items) are ranked in response to a query, mediate search exposure or attention in a variety of safety-critical settings. Thus, it is important to ensure that such rankings are fair. Under the goal of equal opportunity, the attention allocated to an individual on a ranking interface should be proportional to their relevance across search queries. In this work, we examine amortized fair ranking -- where relevance and attention are accumulated over a sequence of user queries to make fair ranking more feasible in practice. Unlike prior methods that operate on the expected amortized attention for each individual, we define new divergence-based measures for attention distribution-based fairness in ranking (DistFaiR), characterizing unfairness as the divergence between the distributions of attention and relevance corresponding to an individual over time. This allows us to propose new definitions of unfairness, which are more reliable at test time. We further prove that group fairness is upper-bounded by individual fairness under this definition for a useful class of divergence measures, and experimentally show that maximizing individual fairness through an integer linear programming-based optimization is often beneficial to group fairness. Lastly, we find that prior research in amortized fair ranking ignores critical information about queries, potentially leading to a fairwashing risk in practice by making rankings appear more fair than they actually are.
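The abstract's amortized setup can be illustrated end to end. The paper optimizes rankings via integer linear programming; the sketch below substitutes a brute-force search over permutations (feasible only for a handful of items) as a stand-in, and assumes a DCG-style position-bias model for attention. All names and inputs here are illustrative, not from the paper.

```python
import math
from itertools import permutations

def position_attention(n):
    # Assumed position-bias model: attention decays like 1 / log2(rank + 2),
    # i.e. the DCG discount, so top ranks receive more attention.
    return [1.0 / math.log2(r + 2) for r in range(n)]

def tv(p, q):
    # Total variation distance between two distributions.
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def normalize(xs):
    s = sum(xs)
    return [x / s for x in xs] if s else xs

def amortized_fair_rankings(relevance_per_query):
    """Greedy stand-in for the paper's ILP: for each query in the sequence,
    pick the permutation of items that minimizes the divergence between
    cumulative attention shares and cumulative relevance shares."""
    n = len(relevance_per_query[0])
    bias = position_attention(n)
    cum_att = [0.0] * n
    cum_rel = [0.0] * n
    chosen = []
    for rel in relevance_per_query:
        cum_rel = [c + r for c, r in zip(cum_rel, rel)]
        best_perm, best_att, best_div = None, None, float("inf")
        for perm in permutations(range(n)):  # brute force: tiny n only
            trial = cum_att[:]
            for rank, item in enumerate(perm):
                trial[item] += bias[rank]
            d = tv(normalize(trial), normalize(cum_rel))
            if d < best_div:
                best_perm, best_att, best_div = perm, trial, d
        cum_att = best_att
        chosen.append(best_perm)
    return chosen, cum_att, cum_rel
```

Because the objective is cumulative, an individual under-served on one query can be compensated on later queries, which is the practical appeal of amortization over per-query fairness constraints.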