🤖 AI Summary
Large language models (LLMs) suffer from spurious correlations—particularly between sentiment and stance—as well as topic- and entity-specific biases in stance detection, leading to biased predictions and poor generalization. To address this, we propose FACTUAL, a Counterfactual Augmented Calibration Network, which introduces the first counterfactual data augmentation framework for bias disentanglement: it explicitly decouples bias representations from genuine stance semantics by generating counterfactual samples, and incorporates a bias-aware calibration module jointly optimized for both zero-shot and in-target stance detection. Evaluated across multiple benchmarks, FACTUAL achieves state-of-the-art accuracy while significantly reducing bias metrics (e.g., Bias Score ↓32.7%). Crucially, our results provide the first empirical evidence of a strong positive correlation between bias mitigation and performance improvement.
📝 Abstract
Stance detection is critical for understanding the underlying position or attitude expressed toward a topic. Large language models (LLMs) have demonstrated significant advancements across various natural language processing tasks, including stance detection; however, their performance in stance detection is limited by biases and spurious correlations inherent in their data-driven nature. Our statistical experiments reveal that LLMs are prone to generating biased stances due to sentiment-stance spurious correlations and preferences toward certain individuals and topics. Furthermore, the results demonstrate a strong negative correlation between stance bias and stance detection performance, underscoring the importance of mitigating bias to enhance the utility of LLMs in stance detection. Therefore, in this paper, we propose a Counterfactual Augmented Calibration Network (FACTUAL), in which a novel calibration network is devised to calibrate potential bias in the stance predictions of LLMs. Further, to address the challenges of effectively learning bias representations and the limited generalizability of debiasing, we construct counterfactually augmented data. This approach enhances the calibration network, facilitating both debiasing and out-of-domain generalization. Experimental results on in-target and zero-shot stance detection tasks show that the proposed FACTUAL can effectively mitigate the biases of LLMs, achieving state-of-the-art results.
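To make the calibration idea concrete, the sketch below illustrates one plausible reading of it: bias logits are estimated from a counterfactual view of the input and subtracted from the raw stance logits. This is an illustrative toy only; the function names, the linear bias head, and the subtraction-style calibration are our assumptions, not the paper's specified architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def calibrate(stance_logits, cf_repr, W_bias):
    """Toy bias calibration (assumed form, not the paper's exact method):
    estimate bias logits from the representation of a counterfactual
    sample, then subtract them from the raw stance logits."""
    bias_logits = cf_repr @ W_bias  # linear bias head over the counterfactual view
    return stance_logits - bias_logits  # debiased stance prediction

# Random tensors stand in for LLM outputs: 2 examples, 3 stance classes
stance_logits = rng.normal(size=(2, 3))   # raw stance logits from the LLM
cf_repr = rng.normal(size=(2, 16))        # encoding of a counterfactual sample
W_bias = rng.normal(size=(16, 3))         # learnable bias-head weights
debiased = calibrate(stance_logits, cf_repr, W_bias)
print(debiased.shape)  # (2, 3)
```

In this reading, training the bias head on counterfactual pairs (same stance, altered sentiment or target) would encourage it to capture only the spurious signal, so that subtracting it leaves the genuine stance semantics.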