🤖 AI Summary
This study systematically investigates how external knowledge integration and reasoning processes in Retrieval-Augmented Generation (RAG) systems affect social bias. Through extensive experiments across diverse retrieval corpora, large language models, and bias evaluation datasets covering more than 13 bias types—augmented with Chain-of-Thought (CoT) prompting and faithfulness analysis—the work reveals that externally retrieved context can mitigate model bias, whereas CoT, despite improving answer accuracy, increases overall bias. These findings expose a trade-off between accuracy and fairness, highlighting the need for bias-aware reasoning frameworks that jointly optimize both dimensions.
📝 Abstract
Social biases inherent in large language models (LLMs) raise significant fairness concerns. Retrieval-Augmented Generation (RAG) architectures, which retrieve external knowledge sources to enhance the generative capabilities of LLMs, remain susceptible to the same bias-related challenges. This work focuses on evaluating and understanding the social bias implications of RAG. Through extensive experiments across various retrieval corpora, LLMs, and bias evaluation datasets, encompassing more than 13 different bias types, we surprisingly observe a reduction in bias with RAG. This suggests that the inclusion of external context can help counteract stereotype-driven predictions, potentially improving fairness by diversifying the contextual grounding of the model's outputs. To better understand this phenomenon, we then explore the model's reasoning process by integrating Chain-of-Thought (CoT) prompting into RAG while assessing the faithfulness of the model's CoT. Our experiments reveal that the model's bias inclinations shift between stereotype and anti-stereotype responses as more contextual information is incorporated from the retrieved documents. Interestingly, we find that while CoT enhances accuracy, it increases overall bias across datasets, in contrast to the bias reduction observed with RAG, highlighting the need for bias-aware reasoning frameworks that can mitigate this trade-off.
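The RAG-plus-CoT setup described in the abstract can be illustrated with a minimal sketch: retrieve external documents for a query, prepend them as context, and optionally append a CoT instruction before generation. The toy retriever, corpus, and prompt format below are hypothetical stand-ins for illustration, not the paper's actual implementation.

```python
# Hypothetical sketch of a RAG prompt pipeline with optional CoT.
# The retriever and corpus are toy placeholders, not the paper's setup.

def retrieve(query, corpus, k=2):
    """Toy lexical retriever: rank documents by word overlap with the query."""
    def overlap(doc):
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(corpus, key=overlap, reverse=True)[:k]

def build_prompt(question, docs, use_cot=False):
    """Assemble a RAG prompt; optionally append a CoT trigger phrase."""
    context = "\n".join(f"[{i+1}] {d}" for i, d in enumerate(docs))
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    if use_cot:
        prompt += " Let's think step by step."
    return prompt

corpus = [
    "Nurses of all genders provide patient care.",
    "The stadium hosted a concert last night.",
]
question = "Who provides patient care?"
docs = retrieve(question, corpus)
prompt = build_prompt(question, docs, use_cot=True)
print(prompt)
```

The resulting prompt would then be passed to an LLM; the study's comparison amounts to measuring bias metrics on model answers with and without the retrieved context and with and without the CoT instruction.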