Agree, Disagree, Explain: Decomposing Human Label Variation in NLI through the Lens of Explanations

📅 2025-10-18
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work investigates the underlying causes of annotator disagreement in Natural Language Inference (NLI) datasets, moving beyond the prior focus on within-label variation to systematically examine cross-label discrepancies in reasoning types and annotation procedures. Method: the authors propose a multidimensional analytical framework grounded in the LiTEx taxonomy of explanations, integrating reasoning-type consistency, explanation-text similarity, and annotator preference modeling; this is the first application of explanation categorization to cross-label variation analysis. Results: experiments on two English NLI datasets show that reasoning-type agreement captures the semantic similarity of explanations better than label agreement does, and that substantive agreement in reasoning often persists beneath surface-level label disagreement, revealing annotators' individual reasoning strategies and decision preferences. The findings challenge the implicit "label-as-ground-truth" assumption and offer a new perspective for developing more robust and interpretable NLI models.

📝 Abstract
Natural Language Inference datasets often exhibit human label variation. To better understand these variations, explanation-based approaches analyze the underlying reasoning behind annotators' decisions. One such approach is the LiTEx taxonomy, which categorizes free-text explanations in English into reasoning types. However, previous work applying such taxonomies has focused on within-label variation: cases where annotators agree on the final NLI label but provide different explanations. In contrast, this paper broadens the scope by examining how annotators may diverge not only in the reasoning type but also in the labeling step. We use explanations as a lens to decompose the reasoning process underlying NLI annotation and to analyze individual differences. We apply LiTEx to two English NLI datasets and align annotation variation along multiple aspects: NLI label agreement, explanation similarity, and taxonomy agreement, with an additional compounding factor of annotators' selection bias. We observe instances where annotators disagree on the label but provide highly similar explanations, suggesting that surface-level disagreement may mask underlying agreement in interpretation. Moreover, our analysis reveals individual preferences in explanation strategies and label choices. These findings highlight that agreement in reasoning types better reflects the semantic similarity of free-text explanations than label agreement alone. Our findings underscore the richness of reasoning-based explanations and the need for caution in treating labels as ground truth.
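The alignment described in the abstract can be illustrated with a minimal sketch. Note the assumptions: the dictionary field names (`label`, `reasoning_type`, `explanation`) and the example annotations are hypothetical, and a simple Jaccard token overlap stands in for the embedding-based explanation similarity a real analysis would likely use.

```python
def explanation_similarity(expl_a: str, expl_b: str) -> float:
    """Jaccard overlap of lowercased token sets (crude proxy for semantic similarity)."""
    tokens_a, tokens_b = set(expl_a.lower().split()), set(expl_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

def compare_pair(ann_a: dict, ann_b: dict) -> dict:
    """Align one annotation pair along the three aspects: label agreement,
    taxonomy (reasoning-type) agreement, and explanation similarity."""
    return {
        "label_agree": ann_a["label"] == ann_b["label"],
        "taxonomy_agree": ann_a["reasoning_type"] == ann_b["reasoning_type"],
        "explanation_sim": explanation_similarity(
            ann_a["explanation"], ann_b["explanation"]
        ),
    }

# Illustrative case: label disagreement masking similar underlying reasoning.
a = {"label": "entailment", "reasoning_type": "world_knowledge",
     "explanation": "birds can usually fly so the hypothesis follows"}
b = {"label": "neutral", "reasoning_type": "world_knowledge",
     "explanation": "birds can usually fly but not always so it may follow"}
result = compare_pair(a, b)
```

In this toy pair the labels diverge while the reasoning type matches and the explanations overlap heavily, the pattern the paper highlights as surface-level disagreement masking agreement in interpretation.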
Problem

Research questions and friction points this paper is trying to address.

Analyzing human label variation in NLI through explanation decomposition
Examining annotator divergence in both reasoning types and labeling decisions
Investigating how explanation similarity reveals hidden agreement beyond labels
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses LiTEx taxonomy to analyze explanation variations
Aligns annotation differences across labels and explanations
Reveals semantic similarity beyond surface label disagreement