🤖 AI Summary
This paper addresses implicit biases in query recommendation systems within information retrieval, particularly along multiple sensitive attributes such as gender and geography. To jointly optimize relevance and fairness, we propose a multi-objective optimization framework that extends BalancedQR with a Pareto-front mechanism that models and balances several bias dimensions at once. Our approach integrates query expansion and re-ranking with an interpretable bias quantification model, and we carry out bias analysis and fairness validation on a Wikipedia dataset. Experimental results indicate that the method reduces multidimensional biases (e.g., gender and geographic bias) while preserving, or even improving, retrieval relevance. Unlike conventional debiasing methods that handle a single dimension, our framework addresses several bias dimensions jointly, offering an approach for building fair and robust retrieval systems.
📝 Abstract
Modern IR systems are an extremely important tool for seeking information. In addition to search, such systems include a number of query reformulation methods, such as query expansion and query recommendation, to provide high-quality results. However, results returned by such methods sometimes exhibit undesirable or wrongful bias with respect to protected categories such as gender or race. Our earlier work considered the problem of balanced query recommendation, where instead of re-ranking a list of results based on fairness measures, the goal was to suggest queries that are relevant to a user's search query but exhibit less bias than the original query. In this work, we present a case study of an extension of BalancedQR that handles biases in multiple dimensions. The extension employs a Pareto front approach that finds balanced queries, optimizing for multiple objectives such as gender bias and regional bias, along with the relevance of returned results. We evaluate the extended version of BalancedQR on a Wikipedia dataset. Our results demonstrate the effectiveness of our extension to the BalancedQR framework and highlight the significant impact of subtle query wording and linguistic choices on retrieval.
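To make the Pareto front idea concrete, the following is a minimal sketch rather than the BalancedQR implementation. It assumes each candidate query has already been scored for relevance (higher is better) and for gender and regional bias (lower absolute value is better); the `Candidate` type, the function names, and all scores are hypothetical and chosen only for illustration. A candidate is kept if no other candidate is at least as good on every objective and strictly better on at least one.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A candidate query with hypothetical, precomputed objective scores."""
    query: str
    relevance: float      # higher is better
    gender_bias: float    # absolute bias, lower is better
    regional_bias: float  # absolute bias, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if `a` is no worse than `b` on all objectives
    and strictly better on at least one."""
    no_worse = (
        a.relevance >= b.relevance
        and a.gender_bias <= b.gender_bias
        and a.regional_bias <= b.regional_bias
    )
    strictly_better = (
        a.relevance > b.relevance
        or a.gender_bias < b.gender_bias
        or a.regional_bias < b.regional_bias
    )
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates (O(n^2) scan)."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Illustrative scores only; they are not taken from the paper.
candidates = [
    Candidate("query A", relevance=0.90, gender_bias=0.60, regional_bias=0.50),
    Candidate("query B", relevance=0.85, gender_bias=0.20, regional_bias=0.30),
    Candidate("query C", relevance=0.80, gender_bias=0.25, regional_bias=0.35),
    Candidate("query D", relevance=0.70, gender_bias=0.10, regional_bias=0.40),
]

front = pareto_front(candidates)
for c in front:
    print(c.query, c.relevance, c.gender_bias, c.regional_bias)
```

In this toy data, "query C" is dropped because "query B" is better on all three objectives, while the remaining candidates each represent a different trade-off between relevance and the two bias dimensions. How a single recommendation is then chosen from the front is a separate design decision that this sketch does not address.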