TriCon-Fair: Triplet Contrastive Learning for Mitigating Social Bias in Pre-trained Language Models

📅 2025-11-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) often amplify societal biases, and existing debiasing methods independently model biased and unbiased samples, overlooking their implicit negative coupling—leading to inter-group performance trade-offs and residual bias. To address this, we propose a triplet-based contrastive learning framework for disentangled debiasing. Our method introduces a disentanglement loss that explicitly decouples representational associations between biased and unbiased samples, and jointly optimizes contrastive learning with the language modeling objective to prevent interference from positive/negative sample misalignment. This end-to-end approach significantly reduces discriminatory outputs across multiple bias evaluation benchmarks—achieving an average 23.6% improvement in fairness metrics—while preserving or even enhancing downstream task performance, outperforming state-of-the-art debiasing methods. The core contribution lies in explicitly modeling and breaking the adverse coupling between biased and unbiased samples, thereby enabling synergistic optimization of fairness and linguistic capability.

📝 Abstract
The increasing utilization of large language models raises significant concerns about the propagation of social biases, which may result in harmful and unfair outcomes. However, existing debiasing methods treat biased and unbiased samples independently, ignoring their mutual relationship. This oversight enables a hidden negative-positive coupling, where improvements for one group inadvertently compromise the other, allowing residual social bias to persist. In this paper, we introduce TriCon-Fair, a contrastive learning framework whose decoupled loss combines triplet and language modeling terms to eliminate positive-negative coupling. TriCon-Fair assigns each anchor an explicitly biased negative and an unbiased positive, decoupling the push-pull dynamics that cause positive-negative coupling, and jointly optimizes a language modeling (LM) objective to preserve general capability. Experimental results demonstrate that TriCon-Fair reduces discriminatory output beyond existing debiasing baselines while maintaining strong downstream performance. This suggests that TriCon-Fair offers a practical and ethical solution for sensitive NLP applications.
Problem

Research questions and friction points this paper is trying to address.

Mitigating social bias in pre-trained language models
Addressing negative-positive coupling in debiasing methods
Reducing discriminatory outputs while preserving model performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses triplet contrastive learning for debiasing
Decouples push-pull dynamics with biased negatives
Jointly optimizes language modeling and debiasing objectives
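The joint objective described above can be sketched in a minimal form: a triplet margin loss pulls each anchor toward its unbiased positive and pushes it away from its biased negative, and the result is summed with the LM loss. This is an illustrative sketch, not the paper's implementation; the Euclidean distance, the `margin` value, and the weighting factor `lam` are assumptions for demonstration.

```python
import math

def euclidean(u, v):
    # plain Euclidean distance between two embedding vectors
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    # pull the anchor toward the unbiased positive and push it away
    # from the explicitly biased negative; zero once the gap exceeds margin
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

def joint_objective(anchor, positive, negative, lm_loss, lam=1.0, margin=1.0):
    # debiasing term plus a weighted language modeling term,
    # so fairness and general capability are optimized together
    return triplet_loss(anchor, positive, negative, margin) + lam * lm_loss
```

For example, with an anchor already far from its biased negative and close to its positive, the triplet term vanishes and only the LM term remains, so the model's general capability drives the update.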
Chong Lyu
School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan, China
Lin Li
School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan, China
Shiqing Wu
Assistant Professor, City University of Macau
Multi-agent systems, Recommender systems, Social influence, Reinforcement Learning
Jingling Yuan
School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan, China