🤖 AI Summary
This study addresses the risk of gender bias propagation in large language models (LLMs) within sensitive domains such as healthcare, where existing bias evaluation frameworks often overlook the interactions among social determinants of health (SDoH) and their contextual dependencies. For the first time, this work employs multidimensional SDoH interactions as probing constructs, integrating prompt engineering and controlled probing experiments to systematically analyze stereotypical behaviors in prominent LLMs when gender co-occurs with other SDoH factors in French clinical texts. The findings demonstrate that LLMs activate gendered stereotypes in response to SDoH-related inputs, thereby confirming that incorporating SDoH interactions significantly enhances and refines current bias assessment methodologies.
📝 Abstract
Large Language Models (LLMs) excel in Natural Language Processing (NLP) tasks, but they often propagate biases embedded in their training data, which is potentially impactful in sensitive domains like healthcare. While existing benchmarks evaluate biases related to individual social determinants of health (SDoH) such as gender or ethnicity, they often overlook interactions between these factors and lack context-specific assessments. This study investigates bias in LLMs by probing the relationships between gender and other SDoH in French patient records. Through a series of experiments, we found that embedded stereotypes can be probed using SDoH input and that LLMs rely on embedded stereotypes to make gendered decisions, suggesting that evaluating interactions among SDoH factors could usefully complement existing approaches to assessing LLM performance and bias.