🤖 AI Summary
This study exposes implicit gender and racial biases in vision-language models (VLMs) such as CLIP and OpenCLIP when they perform medical occupation recognition, posing risks of hiring discrimination, distorted healthcare workforce analytics, and erosion of patient trust. To address this, we introduce a bias evaluation framework tailored to medical contexts, comprising a fine-grained occupational taxonomy, occupation-perception prompt templates, and a balanced multi-ethnic facial image dataset; we further design a zero-shot classification probe to quantify demographic bias. Experiments reveal pervasive, systematic biases across mainstream VLMs: for instance, “surgeon” is strongly associated with Indian male faces, while “speech therapist” is disproportionately linked to White female faces. This work constitutes the first systematic diagnosis of demographic bias in medical VLMs, establishing a reproducible benchmark and a methodological foundation for fair AI in health workforce applications.
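A zero-shot probe of this kind can be built directly on OpenCLIP. The sketch below is a minimal, hypothetical version: the checkpoint, prompt wording, and occupation list are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of a zero-shot occupation probe built on OpenCLIP.
# The checkpoint, prompt template, and occupation list below are
# illustrative assumptions, not the paper's exact configuration.
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

occupations = ["surgeon", "cardiologist", "dentist",
               "nurse", "pharmacist", "speech therapist"]
# Hypothetical occupation-perception prompt template.
text = tokenizer([f"a photo of a {occ}" for occ in occupations])

def probe(image_path: str) -> dict:
    """Return the probe's P(occupation | face image) distribution."""
    image = preprocess(Image.open(image_path)).unsqueeze(0)
    with torch.no_grad():
        img = model.encode_image(image)
        txt = model.encode_text(text)
        img = img / img.norm(dim=-1, keepdim=True)
        txt = txt / txt.norm(dim=-1, keepdim=True)
        probs = (100.0 * img @ txt.T).softmax(dim=-1)
    return dict(zip(occupations, probs.squeeze(0).tolist()))
```

Running the probe over a demographically balanced face set and comparing the resulting occupation distributions across groups surfaces the skewed associations reported above.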
📝 Abstract
Vision-language models (VLMs), such as CLIP and OpenCLIP, can encode and reflect stereotypical associations between medical professions and demographic attributes learned from web-scale data. We present an evaluation protocol for healthcare settings that quantifies these stereotype-aligned biases and assesses their operational risk. Our methodology (i) defines a taxonomy spanning clinicians and allied healthcare roles (e.g., surgeon, cardiologist, dentist, nurse, pharmacist, technician), (ii) curates a profession-aware prompt suite to probe model behavior, and (iii) benchmarks demographic skew against a balanced face corpus. Empirically, we observe consistent demographic biases across multiple roles and vision models. Our work highlights the importance of bias identification in critical domains such as healthcare, as AI-enabled hiring and workforce analytics can have downstream implications for equity, compliance, and patient trust.
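To make the skew measurement concrete, the sketch below aggregates per-image probe predictions over a balanced face corpus and reports each group's over- or under-representation per occupation. The skew definition used here (observed group share minus the uniform share a balanced corpus implies) is one common choice, assumed for illustration rather than taken from the paper.

```python
# Minimal sketch of a demographic-skew summary over a balanced face corpus.
# Assumes each probe prediction is tagged with the image's demographic group;
# the skew definition (observed group share minus the uniform share a
# balanced corpus implies) is one common choice, not the paper's exact metric.
from collections import Counter, defaultdict

def demographic_skew(predictions):
    """predictions: (group, top_occupation) pairs, one per face image."""
    predictions = list(predictions)
    groups = {g for g, _ in predictions}
    uniform = 1.0 / len(groups)  # balanced corpus => equal group shares
    counts = defaultdict(Counter)
    for group, occ in predictions:
        counts[occ][group] += 1
    return {
        occ: {g: c[g] / sum(c.values()) - uniform for g in groups}
        for occ, c in counts.items()
    }

# Toy example: "surgeon" concentrated on one group yields positive skew
# for that group and negative skew for the others.
preds = [("indian_male", "surgeon"), ("indian_male", "surgeon"),
         ("white_female", "speech therapist"), ("black_female", "nurse")]
print(demographic_skew(preds))
```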