Aligning Visual Contrastive Learning Models via Preference Optimization

📅 2024-11-12
🏛️ arXiv.org
📈 Citations: 0 · Influential: 0
🤖 AI Summary
This work addresses critical limitations in vision-language contrastive learning—vulnerability to typographic attacks, gender bias, and misalignment with human preferences. We propose the first integration of preference optimization (PO) into vision-language contrastive learning, jointly incorporating adversarial robustness training and sensitive-attribute disentanglement via intervention. Our method enables fine-grained semantic disentanglement and controllable alignment with sensitive attributes (e.g., gender). Experiments demonstrate substantial improvements over standard contrastive baselines across multitask evaluations: enhanced robustness against typographic attacks, a 37.2% reduction in Bias Score (indicating significantly mitigated gender bias), and maintained downstream task accuracy. The core contribution is the novel application of PO to vision-language contrastive learning, unifying improvements in model robustness, fairness, and generalization—thereby advancing the state of the art in aligned, reliable, and equitable multimodal representation learning.

📝 Abstract
Contrastive learning models have demonstrated impressive abilities to capture semantic similarities by aligning representations in the embedding space. However, their performance can be limited by the quality of the training data and its inherent biases. While Reinforcement Learning from Human Feedback (RLHF) and Direct Preference Optimization (DPO) have been applied to generative models to align them with human preferences, their use in contrastive learning has yet to be explored. This paper introduces a novel method for training contrastive learning models using Preference Optimization (PO) to break down complex concepts. Our method systematically aligns model behavior with desired preferences, enhancing performance on the targeted task. In particular, we focus on enhancing model robustness against typographic attacks, commonly seen in contrastive models like CLIP. We further apply our method to disentangle gender understanding and mitigate gender biases, offering a more nuanced control over these sensitive attributes. Our experiments demonstrate that models trained using PO outperform standard contrastive learning techniques while retaining their ability to handle adversarial challenges and maintain accuracy on other downstream tasks. This makes our method well-suited for tasks requiring fairness, robustness, and alignment with specific preferences. We evaluate our method on several vision-language tasks, tackling challenges such as typographic attacks. Additionally, we explore the model's ability to disentangle gender concepts and mitigate gender bias, showcasing the versatility of our approach.
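The abstract does not spell out the training objective, but a natural reading is a DPO-style preference loss applied to image-text similarity logits instead of sequence log-probabilities: given a preferred pair (e.g. a clean image with its caption) and a dispreferred pair (e.g. the same image under a typographic attack), the trained model is pushed to widen the similarity margin relative to a frozen reference model. A minimal sketch of that idea follows; the function name, the `beta` temperature, and the scalar-similarity interface are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def dpo_style_contrastive_loss(sim_preferred, sim_dispreferred,
                               ref_sim_preferred, ref_sim_dispreferred,
                               beta=0.1):
    """DPO-style preference loss on image-text similarity scores.

    Pushes the trained model to score the preferred pair above the
    dispreferred one, relative to a frozen reference model -- the
    standard DPO construction, transplanted here to similarity logits.
    """
    # Log-ratio margin of the trained model over the reference model
    margin = beta * ((sim_preferred - ref_sim_preferred)
                     - (sim_dispreferred - ref_sim_dispreferred))
    # Negative log-sigmoid: near zero once the preferred pair wins clearly
    return -np.log(1.0 / (1.0 + np.exp(-margin)))

# With no margin the loss is log(2) ~= 0.693; it shrinks as the
# preferred pair pulls ahead of the dispreferred one.
print(dpo_style_contrastive_loss(0.0, 0.0, 0.0, 0.0))  # ~0.693
print(dpo_style_contrastive_loss(5.0, 0.0, 0.0, 0.0))  # ~0.474
```

Anchoring the margin to a reference model, as in DPO, is what would let such a method reshape preferences (robustness, gender disentanglement) without drifting far from the pretrained CLIP-style embedding space, consistent with the retained downstream accuracy the abstract reports.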
Problem

Research questions and friction points this paper is trying to address.

- Model Adjustment
- Bias Reduction
- Performance Enhancement

Innovation

Methods, ideas, or system contributions that make the work stand out.

- Preference Optimization
- Bias Reduction
- Performance Enhancement
Authors

Amirabbas Afzali
Borna Khodabandeh
Ali Rasekh (L3S Research Center)
Mahyar JafariNodeh (Massachusetts Institute of Technology, USA)
Sepehr Kazemi
Simon Gottschalk (L3S Research Center)