🤖 AI Summary
Large language models (LLMs) tend to amplify gender bias in retrieval and ranking tasks, compromising fairness in search engines and recommendation systems. To address this, we propose a debiasing framework built on the Backpack architecture, which represents each word as a weighted combination of non-contextual, learned semantic senses; the framework identifies gender-associated senses via bias sensitivity analysis and suppresses them. This enables fine-grained, feature-level decoupling and modulation of gender bias without fine-tuning the backbone LLM. Experiments on standard retrieval benchmarks, including MS MARCO, show that our method reduces gender bias by an average of 38.2% while preserving over 98.5% of the original ranking accuracy, striking a strong balance between fairness and effectiveness.
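The core mechanism described above can be illustrated with a minimal sketch: a word's representation is a weighted sum of its sense vectors, a per-sense bias sensitivity score flags gender-associated senses, and those senses' weights are down-scaled before recombination. All names, sizes, the gender-direction estimate, and the threshold below are hypothetical placeholders, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
k, d = 16, 64  # illustrative: senses per word, embedding dimension

# Hypothetical Backpack quantities for a single word:
# k non-contextual sense vectors, and contextual weights over them.
sense_vecs = rng.normal(size=(k, d))   # stand-in for the word's sense matrix C(w)
alpha = rng.dirichlet(np.ones(k))      # stand-in for contextual sense weights

# A "gender direction" (e.g., estimated from he/she embedding differences).
gender_dir = rng.normal(size=d)
gender_dir /= np.linalg.norm(gender_dir)

# Bias sensitivity per sense: magnitude of projection onto the gender direction.
sensitivity = np.abs(sense_vecs @ gender_dir)

# Suppress the most gender-associated senses by down-scaling their weights
# (threshold and scale factor are arbitrary choices for this sketch).
tau = np.quantile(sensitivity, 0.75)
scale = np.where(sensitivity > tau, 0.1, 1.0)
alpha_debiased = alpha * scale
alpha_debiased /= alpha_debiased.sum()  # renormalize to a distribution

# Word representation before and after sense-level debiasing.
rep = alpha @ sense_vecs
rep_debiased = alpha_debiased @ sense_vecs
```

Because only the sense weights are modulated, the backbone's parameters (and thus its ranking behavior on bias-neutral inputs) are left untouched, which is what allows debiasing without fine-tuning.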
📝 Abstract
The presence of social biases in large language models (LLMs) has become a significant concern in AI research. These biases, often embedded in training data, can perpetuate harmful stereotypes and distort decision-making processes. When LLMs are integrated into ranking systems, they can propagate these biases, leading to unfair outcomes in critical applications such as search engines and recommendation systems. Unlike traditional transformer-based models, which treat text sequences as monolithic structures, Backpack Language Models generate outputs as weighted combinations of non-contextual, learned word aspects, known as senses. Leveraging this architecture, we propose a framework for debiasing ranking tasks. Our experimental results show that this framework effectively mitigates gender bias in text retrieval and ranking with minimal degradation in performance.