🤖 AI Summary
This study investigates how bias is systematically introduced and amplified in memory-augmented personalized AI recruitment agents. Prior work largely overlooks how memory modules dynamically accumulate bias; to address this gap, we propose the first analytical framework for bias propagation pathways specific to memory modules. Our methodology combines safety-finetuned LLMs, a memory-augmented agent architecture for simulated recruitment tasks, and a multi-stage bias quantification protocol. Experimental results show that memory mechanisms significantly exacerbate gender and racial biases, with bias intensifying progressively across interaction rounds, confirming the risk of "memory-driven bias amplification." Our core contribution is the first systematic identification of bias propagation pathways induced by memory-augmented personalization, together with empirical evidence that a dedicated memory-layer bias mitigation framework is both necessary and feasible.
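To make the quantification step concrete, here is a minimal sketch of one plausible per-round bias metric: a demographic parity gap, i.e., the spread in positive-decision rates across demographic groups at each interaction round. The metric choice, the `parity_gap_per_round` function, and the record layout are illustrative assumptions, not the paper's actual protocol.

```python
from collections import defaultdict

def parity_gap_per_round(records):
    """Compute a demographic parity gap per interaction round.

    records: iterable of (round_idx, group, decision) tuples, where decision
    is 1 for a positive outcome (e.g., "advance to interview") and 0 otherwise.
    Returns {round_idx: max group positive rate - min group positive rate}.
    """
    # round -> group -> [positive count, total count]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for rnd, group, decision in records:
        counts[rnd][group][0] += decision
        counts[rnd][group][1] += 1

    gaps = {}
    for rnd, groups in counts.items():
        rates = [pos / total for pos, total in groups.values() if total]
        gaps[rnd] = max(rates) - min(rates)
    return gaps

# A gap that rises over rounds would be consistent with memory-driven
# bias amplification; a flat gap would not. (Toy data, for illustration.)
demo = [(0, "A", 1), (0, "B", 1),
        (1, "A", 1), (1, "B", 0),
        (2, "A", 1), (2, "B", 0)]
print(parity_gap_per_round(demo))  # {0: 0.0, 1: 1.0, 2: 1.0}
```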
📝 Abstract
Large Language Models (LLMs) have empowered AI agents with advanced capabilities for understanding, reasoning, and interacting across diverse tasks. The addition of memory further enhances them by enabling continuity across interactions, learning from past experiences, and improving the relevance of actions and responses over time, a capability termed memory-enhanced personalization. Although such personalization through memory offers clear benefits, it also introduces risks of bias. While several previous studies have highlighted bias in ML and LLMs, bias arising from memory-enhanced personalized agents remains largely unexplored. Using recruitment as an example use case, we simulate the behavior of a memory-enhanced personalized agent and study whether and how bias is introduced and amplified within and across various stages of operation. Our experiments on agents using safety-trained LLMs reveal that bias is systematically introduced and reinforced through personalization, emphasizing the need for additional protective measures or agent guardrails in memory-enhanced LLM-based AI agents.
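As a rough illustration of the kind of agent loop being simulated, the sketch below shows how a memory store can feed past screening decisions back into later prompts, creating the feedback pathway through which early skew can compound across rounds. The `MemoryStore` class, the `call_llm` stub, and the prompt format are assumptions made for illustration; they are not the paper's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryStore:
    """Accumulates past screening decisions for replay into later prompts."""
    entries: list = field(default_factory=list)

    def add(self, note: str) -> None:
        self.entries.append(note)

    def as_context(self, k: int = 10) -> str:
        # Inject only the k most recent memories, mimicking a bounded
        # context window; this retrieval policy is an illustrative choice.
        return "\n".join(self.entries[-k:])

def call_llm(prompt: str) -> str:
    # Stand-in for a safety-trained LLM call; replace with a real client.
    return "advance"  # canned response so the sketch runs end to end

def screen_candidate(memory: MemoryStore, resume: str) -> str:
    prompt = (
        "You are a recruitment assistant. Past decisions:\n"
        f"{memory.as_context()}\n\n"
        f"Candidate resume:\n{resume}\n\n"
        "Decision (advance/reject) with a one-line rationale:"
    )
    decision = call_llm(prompt)
    # The decision is written back to memory, so any skew in early rounds
    # is fed forward into every later prompt: the amplification pathway
    # the study examines.
    memory.add(f"candidate={resume[:40]!r} decision={decision}")
    return decision

memory = MemoryStore()
for resume in ["Resume A ...", "Resume B ..."]:
    print(screen_candidate(memory, resume))
```

Under this framing, auditing only the base LLM misses the risk: even an unbiased initial model can drift once its own outputs become part of its input distribution.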