ViLBias: A Comprehensive Framework for Bias Detection through Linguistic and Visual Cues, Presenting Annotation Strategies, Evaluation, and Key Challenges

📅 2024-12-22
📈 Citations: 1
Influential: 0
🤖 AI Summary
This paper addresses the challenge of detecting implicit bias in news—such as linguistic framing bias and image-text inconsistency—by proposing ViLBias, a multimodal bias detection framework. Methodologically, it jointly leverages textual and visual cues through coordinated invocation of large language models (LLMs), vision-language models (VLMs), and small language models (SLMs), and introduces a novel hybrid annotation paradigm combining LLM-assisted labeling with human verification. Key contributions include: (1) the first systematic evaluation of SLMs, LLMs, and VLMs for bimodal bias detection, demonstrating LLMs’ superior fine-grained recognition capability over SLMs; (2) empirical validation that image-text joint modeling improves detection accuracy by 3–5%; and (3) release of the first benchmark dataset for multimodal news bias, covering diverse news sources and featuring fine-grained, human-verified annotations. These results provide both a novel methodology and reproducible resources for multimodal bias assessment.
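The joint image-text modeling credited above with the 3–5% accuracy gain can be illustrated with a minimal late-fusion sketch. This is not the paper's architecture: the scoring functions, the word-overlap mismatch proxy, and the fusion weight `w_text` are all hypothetical stand-ins for real LLM/VLM outputs.

```python
def text_bias_score(text: str) -> float:
    """Toy stand-in for an LLM's text-only bias probability,
    based on a tiny list of emotionally loaded terms."""
    loaded = {"outrageous", "scandal", "radical"}
    words = [w.strip(".,!").lower() for w in text.split()]
    return min(1.0, 0.2 + 0.3 * sum(w in loaded for w in words))

def image_text_mismatch_score(caption: str, text: str) -> float:
    """Toy proxy for a VLM's image-text inconsistency signal:
    low word overlap between the image caption and the article text."""
    c, t = set(caption.lower().split()), set(text.lower().split())
    overlap = len(c & t) / max(len(c), 1)
    return 1.0 - overlap

def fused_bias_score(text: str, caption: str, w_text: float = 0.7) -> float:
    """Late fusion of textual and visual cues: the second term is the
    kind of visual signal the summary says improves detection."""
    return w_text * text_bias_score(text) + (1 - w_text) * image_text_mismatch_score(caption, text)
```

With a loaded headline and an unrelated stock-photo caption, the fused score exceeds the text-only score, showing how the visual-inconsistency term can push a borderline case over a decision threshold.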

📝 Abstract
The integration of Large Language Models (LLMs) and Vision-Language Models (VLMs) opens new avenues for addressing complex challenges in multimodal content analysis, particularly in biased news detection. This study introduces ViLBias, a framework that leverages state-of-the-art LLMs and VLMs to detect linguistic and visual biases in news content. We present a multimodal dataset comprising textual content and corresponding images from diverse news sources. We propose a hybrid annotation framework that combines LLM-based annotations with human review to ensure high-quality labeling while reducing costs and enhancing scalability. Our evaluation compares the performance of state-of-the-art SLMs and LLMs for both modalities (text and images), and the results reveal that while SLMs are computationally efficient, LLMs demonstrate superior accuracy in identifying subtle framing and text-visual inconsistencies. Furthermore, empirical analysis shows that incorporating visual cues alongside textual data improves bias detection accuracy by 3 to 5%. This study provides a comprehensive exploration of LLMs, SLMs, and VLMs as tools for detecting multimodal biases in news content and highlights their respective strengths, limitations, and potential for future applications.
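The hybrid annotation framework described in the abstract can be sketched as a routing loop: the LLM labels every sample, and low-confidence labels are queued for human verification. This is a minimal illustration, not the authors' pipeline; the mock annotator, the confidence heuristic, and the 0.8 review threshold are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    text: str
    image_caption: str
    llm_label: str = ""          # e.g. "biased" / "unbiased", assigned by the LLM
    llm_confidence: float = 0.0
    needs_human_review: bool = False

def mock_llm_annotate(sample: Sample) -> tuple[str, float]:
    """Stand-in for a real LLM call (hypothetical: the paper does not
    publish its exact prompt or model here). Returns (label, confidence)."""
    loaded_terms = {"shocking", "disaster", "radical"}
    hits = sum(w.strip(".,").lower() in loaded_terms for w in sample.text.split())
    if hits:
        return "biased", min(0.5 + 0.2 * hits, 0.95)
    return "unbiased", 0.6

def hybrid_annotate(samples: list[Sample], review_threshold: float = 0.8) -> list[Sample]:
    """LLM-assisted labeling with human verification for low-confidence
    cases, in the spirit of the hybrid framework described above."""
    review_queue = []
    for s in samples:
        s.llm_label, s.llm_confidence = mock_llm_annotate(s)
        if s.llm_confidence < review_threshold:
            s.needs_human_review = True
            review_queue.append(s)   # humans verify only this subset
    return review_queue

samples = [
    Sample("A shocking disaster unfolds downtown", "crowd photo"),
    Sample("Council approves the annual budget", "city hall"),
]
queue = hybrid_annotate(samples)
```

Only the uncertain second sample lands in the human queue, which is the cost-reduction mechanism the abstract attributes to the hybrid design: annotators verify a subset rather than labeling everything from scratch.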
Problem

Research questions and friction points this paper is trying to address.

Bias Detection
News Content Analysis
Efficiency Improvement
Innovation

Methods, ideas, or system contributions that make the work stand out.

ViLBias
Multimodal Bias Detection
Integrated Language and Vision Models
Shaina Raza
Vector Institute, Toronto, M5G 1M1, ON, Canada
Caesar Saleh
Vector Institute, Toronto, M5G 1M1, ON, Canada
Emrul Hasan
Toronto Metropolitan University & Vector Institute
Recommender Systems, LLMs, Deep Learning, NLP, and Responsible AI
Franklin Ogidi
Vector Institute, Toronto, M5G 1M1, ON, Canada
Maximus Powers
Vector Institute, Toronto, M5G 1M1, ON, Canada
Veronica Chatrath
Technical Program Manager | Vector Institute
Marcelo Lotif
Senior Software Developer, Vector Institute
Machine Learning, Artificial Intelligence
Roya Javadi
Vector Institute, Toronto, M5G 1M1, ON, Canada
Anam Zahid
Vector Institute, Toronto, M5G 1M1, ON, Canada
Vahid Reza Khazaie
Vector Institute
Deep Learning