🤖 AI Summary
Existing zero-shot stance detection (ZSSD) methods generalize poorly, align targets with text weakly, and offer limited interpretability: they over-rely on explicit reasoning, provide only coarse explanations, and do not explicitly model the reasoning process. This paper proposes IRIS, an interpretable ZSSD framework that formulates stance detection as an information retrieval ranking task. Leveraging large language models, IRIS jointly models implicit reasoning paths within the text and explicit linguistic features grounded in affective-cognitive dimensions, without requiring rationale annotations, enabling fine-grained, interpretable predictions. On benchmarks including VAST, EZ-STANCE, P-Stance, and RFD, IRIS matches or outperforms state-of-the-art methods while using only 10%–50% of the training data, empirically validating its robust generalization and explanatory advantages.
📝 Abstract
Zero-Shot Stance Detection (ZSSD) identifies the attitude of a post toward targets unseen during training. Existing approaches based on contrastive learning, meta-learning, or data augmentation suffer from limited generalizability or weak coherence between text and target. Recent work leveraging large language models (LLMs) for ZSSD focuses either on enriching knowledge about unseen targets or on generating explanations for stance analysis. However, most of these methods over-rely on explicit reasoning, provide coarse explanations that lack nuance, and do not explicitly model the reasoning process, making the model's predictions difficult to interpret. To address these issues, we develop IRIS, a novel interpretable ZSSD framework. IRIS explains the attitude of the input toward the target both implicitly, through sequences within the text (implicit rationales), and explicitly, through linguistic measures (explicit rationales). It treats stance detection as an information retrieval ranking task, learning the relevance of implicit rationales to different stances to guide the model toward correct predictions without ground-truth rationale annotations, thereby providing inherent interpretability. In addition, explicit rationales based on communicative features decode the emotional and cognitive dimensions of stance, offering an interpretable view of the author's attitude toward the given target. Extensive experiments on the benchmark datasets VAST, EZ-STANCE, P-Stance, and RFD, using 50%, 30%, and even 10% of the training data, demonstrate the generalizability of our model, which benefits from the proposed architecture and interpretable design.
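To make the ranking formulation concrete, here is a minimal, purely illustrative sketch of stance detection as retrieval-style ranking: candidate rationale spans from the post are scored for relevance against each candidate stance, and stances are ranked by their best-matching rationale. All function names, the n-gram rationale extraction, the bag-of-words scoring, and the stance "prototypes" are hypothetical stand-ins, not the paper's actual LLM-based implementation.

```python
# Illustrative sketch (NOT the paper's implementation): stance detection
# framed as ranking rationale-stance relevance scores.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rationales(text: str, n: int = 3) -> list:
    """Candidate implicit rationales: all word n-grams of the post."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(max(1, len(words) - n + 1))]

def rank_stances(text: str, target: str, stance_prototypes: dict) -> list:
    """Score each stance by its best-matching rationale, then rank stances."""
    tgt = set(target.lower().split())
    # Prefer rationales that mention the target; fall back to all of them.
    cands = [r for r in rationales(text) if tgt & set(r.split())] or rationales(text)
    scores = {
        stance: max(cosine(Counter(r.split()), Counter(proto.lower().split()))
                    for r in cands)
        for stance, proto in stance_prototypes.items()
    }
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Toy keyword "prototypes" standing in for learned stance representations.
protos = {"favor": "support agree good",
          "against": "oppose bad wrong",
          "neutral": "unsure maybe"}
ranking = rank_stances("I strongly support a carbon tax", "carbon tax", protos)
print(ranking[0][0])  # top-ranked stance
```

In IRIS, by analogy, the relevance of implicit rationales to each stance is learned rather than computed from keyword overlap, which is what removes the need for ground-truth rationale annotations.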