AI Summary
This work addresses the challenge of bias propagation in Retrieval-Augmented Generation (RAG), where biases introduced during retrieval can uncontrollably influence the final generation through multi-document aggregation. To mitigate this, the study introduces FARO, the first approach to explicitly model fairness within the RAG retrieval stage. FARO employs a scalable two-stage fairness optimization framework that injects controllable bias via re-ranking, constructs a position-aware bias propagation model, and formulates an optimization objective balancing relevance and fairness. An efficient solver is enabled through dual hyperplane approximation. Experimental results demonstrate that FARO effectively reduces generative bias while preserving high retrieval relevance, offering a principled and practical solution for fair retrieval in RAG systems.
Abstract
Retrieval-Augmented Generation (RAG) improves the reliability of large language models by incorporating external knowledge, but the retrieval process can introduce bias that propagates to the generated outputs. This issue is particularly challenging in top-k settings, where multiple documents jointly influence generation. We propose a fairness-aware retrieval framework that models and controls this bias. Our approach combines controlled bias injection via reranking, a position-aware model of bias propagation, and an optimization formulation that balances relevance and fairness. We further introduce a scalable solution based on Quadratic Fairness via Dual Hyperplane Approximation (FARO), which enables efficient optimization through problem decomposition. Experimental results show that our method effectively mitigates generation bias while preserving relevance. This work provides a principled approach for fairness-aware retrieval in RAG systems.
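To make the relevance and fairness trade-off concrete, the following is a minimal sketch of position-aware reranking, not the paper's implementation. Everything specific in it is an assumption made for illustration: the signed per-document bias scores, the logarithmic position discount, the squared penalty on the aggregated bias, the trade-off weight `lam`, and the brute-force search over ordered top-k lists. The dual hyperplane approximation that the abstract credits for scalability is not reproduced here.

```python
import math
from itertools import permutations


def position_weights(k):
    """Assumed discount: earlier positions influence generation more."""
    raw = [1.0 / math.log2(i + 2) for i in range(k)]
    total = sum(raw)
    return [w / total for w in raw]


def aggregated_bias(ranking, bias):
    """Position-weighted sum of signed per-document bias scores (assumed model)."""
    weights = position_weights(len(ranking))
    return sum(w * bias[doc] for w, doc in zip(weights, ranking))


def objective(ranking, relevance, bias, lam):
    """Position-weighted relevance minus a quadratic penalty on aggregated bias."""
    weights = position_weights(len(ranking))
    rel = sum(w * relevance[doc] for w, doc in zip(weights, ranking))
    return rel - lam * aggregated_bias(ranking, bias) ** 2


def rerank(candidates, relevance, bias, k, lam):
    """Exhaustive search over ordered top-k lists; only viable for tiny inputs."""
    return max(
        permutations(candidates, k),
        key=lambda ranking: objective(ranking, relevance, bias, lam),
    )


# Hypothetical toy data: two documents lean one way (+1), two the other (-1).
relevance = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6}
bias = {"a": 1.0, "b": 1.0, "c": -1.0, "d": -1.0}

baseline = rerank(list(relevance), relevance, bias, k=2, lam=0.0)
fair = rerank(list(relevance), relevance, bias, k=2, lam=5.0)
```

With `lam=0.0` the search reduces to plain relevance ranking. Raising `lam` pushes the selection toward a top-k list whose position-weighted biases cancel, at some cost in relevance. The exhaustive search grows factorially with the candidate pool, which is the kind of cost a decomposed or approximate solver would need to avoid.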