🤖 AI Summary
Large language models (LLMs) suffer from spurious correlations—particularly between sentiment and stance—as well as topic- and entity-specific biases in stance detection, leading to biased predictions and poor generalization. To address this, we propose FACTUAL, a Counterfactual Augmented Calibration Network, which introduces the first counterfactual data augmentation framework for bias disentanglement: it explicitly decouples bias representations from genuine stance semantics by generating counterfactual samples, and incorporates a bias-aware calibration module jointly optimized for both zero-shot and in-target stance detection. Evaluated across multiple benchmarks, FACTUAL achieves state-of-the-art accuracy while significantly reducing bias metrics (e.g., Bias Score ↓32.7%). Crucially, our results provide the first empirical evidence of a strong positive correlation between bias mitigation and performance improvement.
📝 Abstract
Stance detection is critical for understanding the underlying position or attitude expressed toward a topic. Large language models (LLMs) have demonstrated significant advancements across various natural language processing tasks, including stance detection; however, their performance in stance detection is limited by biases and spurious correlations inherent in their data-driven nature. Our statistical experiments reveal that LLMs are prone to generating biased stances due to sentiment-stance spurious correlations and preferences toward certain individuals and topics. Furthermore, the results demonstrate a strong negative correlation between stance bias and stance detection performance, underscoring the importance of mitigating bias to enhance the utility of LLMs in stance detection. Therefore, in this paper, we propose a Counterfactual Augmented Calibration Network (FACTUAL), in which a novel calibration network is devised to calibrate potential bias in the stance predictions of LLMs. Further, to address the challenges of effectively learning bias representations and the limited generalizability of debiasing, we construct counterfactually augmented data. This approach enhances the calibration network, facilitating both debiasing and out-of-domain generalization. Experimental results on in-target and zero-shot stance detection tasks show that the proposed FACTUAL can effectively mitigate the biases of LLMs, achieving state-of-the-art results.
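To make the calibration idea concrete, the sketch below illustrates one plausible reading of it: bias logits are estimated from a counterfactual view of the input and subtracted from the raw stance logits. This is an illustrative toy only; the function names, the linear bias head, and the subtraction-style calibration are our assumptions, not the paper's specified architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def calibrate(stance_logits, cf_repr, W_bias):
    """Toy bias calibration (assumed form, not the paper's exact method):
    estimate bias logits from the representation of a counterfactual
    sample, then subtract them from the raw stance logits."""
    bias_logits = cf_repr @ W_bias  # linear bias head over the counterfactual view
    return stance_logits - bias_logits  # debiased stance prediction

# Random tensors stand in for LLM outputs: 2 examples, 3 stance classes
stance_logits = rng.normal(size=(2, 3))   # raw stance logits from the LLM
cf_repr = rng.normal(size=(2, 16))        # encoding of a counterfactual sample
W_bias = rng.normal(size=(16, 3))         # learnable bias-head weights
debiased = calibrate(stance_logits, cf_repr, W_bias)
print(debiased.shape)  # (2, 3)
```

In this reading, training the bias head on counterfactual pairs (same stance, altered sentiment or target) would encourage it to capture only the spurious signal, so that subtracting it leaves the genuine stance semantics.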