Investigating Associational Biases in Inter-Model Communication of Large Generative Models

📅 2026-01-29
🤖 AI Summary
This study addresses how large generative models acquire and propagate stereotypical associations even in the absence of explicit demographic information, and how such biases can be amplified during inter-model communication, compromising fairness in human-centric perception tasks. The authors construct an alternating image-generation and text-description communication pipeline to systematically quantify, for the first time, the demographic distribution shifts induced by such communication. Using the RAF-DB and PHASE datasets alongside interpretability analyses, they show that the communication process systematically skews representations toward younger, more female-presenting attributes and relies on spurious visual cues, such as background or hairstyle, for activity and emotion prediction. Based on these findings, the work proposes mitigation strategies spanning the data-curation, training, and deployment phases, reducing the downstream impact of such biases.
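The pipeline the summary describes can be sketched as a loop that alternates a generator and a captioner, then compares the demographic distribution of early versus late rounds. The sketch below is a toy simulation, not the paper's implementation: `generate_image` and `describe_image` are hypothetical stubs in which the "image" is just a demographic attribute, and the generator is given a mild skew toward `"young"` so that drift accumulates across rounds, mirroring the amplification effect the study measures.

```python
import random
from collections import Counter

def generate_image(description, rng):
    # Toy stub generator: mildly biased toward the 'young' attribute, and it
    # echoes 'young' whenever the incoming description already says so.
    if description == "young" or rng.random() < 0.6:
        return "young"
    return "old"

def describe_image(image):
    # Toy stub captioner: reports the demographic attribute it "sees".
    return image

def run_pipeline(seed_description, rounds, rng):
    """Alternate generation and description, recording the attribute per round."""
    desc, history = seed_description, []
    for _ in range(rounds):
        img = generate_image(desc, rng)
        desc = describe_image(img)
        history.append(desc)
    return history

def total_variation(p, q):
    """Total variation distance between two discrete distributions (dicts)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

rng = random.Random(0)
runs = [run_pipeline("old", rounds=10, rng=rng) for _ in range(200)]
n = len(runs)
p_first = {k: v / n for k, v in Counter(r[0] for r in runs).items()}
p_last = {k: v / n for k, v in Counter(r[-1] for r in runs).items()}
drift = total_variation(p_first, p_last)
print(p_first, p_last, round(drift, 3))
```

Because the stub generator treats `"young"` as an absorbing state, the attribute distribution at round 10 is far more skewed than at round 1, and the total variation distance quantifies that drift, which is the same comparison, applied to real demographic predictions, that the paper performs.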

📝 Abstract
Social bias in generative AI can manifest not only as performance disparities but also as associational bias, whereby models learn and reproduce stereotypical associations between concepts and demographic groups, even in the absence of explicit demographic information (e.g., associating doctors with men). These associations can persist, propagate, and potentially amplify across repeated exchanges in inter-model communication pipelines, where one generative model's output becomes another's input. This is especially salient for human-centred perception tasks, such as human activity recognition and affect prediction, where inferences about behaviour and internal states can lead to errors or stereotypical associations that propagate into unequal treatment. In this work, focusing on human activity and affective expression, we study how such associations evolve within an inter-model communication pipeline that alternates between image generation and image description. Using the RAF-DB and PHASE datasets, we quantify demographic distribution drift induced by model-to-model information exchange and assess whether these drifts are systematic using an explainability pipeline. Our results reveal demographic drifts toward younger representations for both actions and emotions, as well as toward more female-presenting representations, primarily for emotions. We further find evidence that some predictions are supported by spurious visual regions (e.g., background or hair) rather than concept-relevant cues (e.g., body or face). We also examine whether these demographic drifts translate into measurable differences in downstream behaviour, i.e., while predicting activity and emotion labels. Finally, we outline mitigation strategies spanning data-centric, training and deployment interventions, and emphasise the need for careful safeguards when deploying interconnected models in human-centred AI systems.
Problem

Research questions and friction points this paper is trying to address.

associational bias
inter-model communication
generative AI
demographic drift
stereotypical associations
Innovation

Methods, ideas, or system contributions that make the work stand out.

associational bias
inter-model communication
demographic drift
explainable AI
generative models