A Case Study of Balanced Query Recommendation on Wikipedia

📅 2025-08-27
🤖 AI Summary
This paper addresses implicit biases in query recommendation systems—particularly along multidimensional sensitive attributes such as gender and geography—within information retrieval. To jointly optimize relevance and fairness, we propose a multi-objective optimization framework that extends BalancedQR by incorporating a Pareto-frontier-driven mechanism to simultaneously model and balance multiple bias dimensions. Our approach integrates query expansion and re-ranking with an interpretable bias quantification model, and conducts bias analysis and fairness validation on the Wikipedia dataset. Experimental results demonstrate that our method significantly reduces multidimensional biases (e.g., gender and geographic bias) while preserving—or even improving—retrieval relevance. Unlike conventional single-dimension debiasing methods, our framework overcomes their inherent limitations and establishes a novel paradigm for building fair and robust retrieval systems.

📝 Abstract
Modern IR systems are an extremely important tool for seeking information. In addition to search, such systems include a number of query reformulation methods, such as query expansion and query recommendation, to provide high-quality results. However, results returned by such methods sometimes exhibit undesirable or wrongful bias with respect to protected categories such as gender or race. Our earlier work considered the problem of balanced query recommendation, where instead of re-ranking a list of results based on fairness measures, the goal was to suggest queries that are relevant to a user's search query but exhibit less bias than the original query. In this work, we present a case study of BalancedQR using an extension that handles biases in multiple dimensions. The extension employs a Pareto front approach to find balanced queries, optimizing for multiple objectives such as gender bias and regional bias, along with the relevance of the returned results. We evaluate the extended version of BalancedQR on a Wikipedia dataset. Our results demonstrate the effectiveness of our extension to the BalancedQR framework and highlight the significant impact that subtle differences in query wording and linguistic choice have on retrieval.
Problem

Research questions and friction points this paper is trying to address.

Addressing bias in query recommendations across multiple dimensions
Optimizing for relevance while reducing gender and regional bias
Studying impact of query wording on balanced information retrieval
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pareto front multi-objective optimization for bias reduction
Handling biases in multiple protected dimensions simultaneously
Balancing relevance with gender and regional bias mitigation
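The Pareto front idea named above can be illustrated with a minimal sketch. This is not the BalancedQR implementation: the `Candidate` class, the three score fields, and the numeric values are hypothetical and chosen only for illustration. The sketch assumes each candidate query already has a relevance score (higher is better) and absolute gender and regional bias scores (lower is better), and it keeps the candidates that no other candidate beats on every objective at once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate query with hypothetical, illustrative scores."""
    query: str
    relevance: float      # higher is better
    gender_bias: float    # absolute bias, lower is better
    regional_bias: float  # absolute bias, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on every objective and strictly better on at least one."""
    no_worse = (
        a.relevance >= b.relevance
        and a.gender_bias <= b.gender_bias
        and a.regional_bias <= b.regional_bias
    )
    strictly_better = (
        a.relevance > b.relevance
        or a.gender_bias < b.gender_bias
        or a.regional_bias < b.regional_bias
    )
    return no_worse and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only the candidates that no other candidate dominates (input order preserved)."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Made-up scores, not results from the paper.
candidates = [
    Candidate("query A", 0.90, 0.40, 0.30),  # most relevant, most biased
    Candidate("query B", 0.85, 0.10, 0.20),
    Candidate("query C", 0.80, 0.15, 0.25),  # worse than B on all three objectives
    Candidate("query D", 0.60, 0.05, 0.05),  # least biased, least relevant
]
front = pareto_front(candidates)
print([c.query for c in front])
```

In this toy input, "query C" is dropped because "query B" is better on all three objectives, while the remaining queries each represent a different trade-off between relevance and bias. How a single recommendation is then chosen from the front is a separate design decision that this sketch does not cover.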
Harshit Mishra
Syracuse University, Syracuse, New York, USA
Sucheta Soundarajan
Syracuse University
Data Mining, Social Network Analysis, Algorithms