Not All Subjectivity Is the Same! Defining Desiderata for the Evaluation of Subjectivity in NLP

📅 2026-03-30
🤖 AI Summary
Current evaluation practices for NLP models often overlook the diversity of subjective perspectives, including minority viewpoints that dominant perspectives tend to obscure, and so may not reflect whether subjectivity-sensitive models meet their objectives. This position paper distinguishes ambiguous from polyphonic input and proposes seven evaluation desiderata for subjectivity-sensitive models. Constructed top-down, the desiderata are rooted in how subjectivity is represented in NLP data and models, and they keep the user-centric impact of such models in view. A scan of the experimental setups of 60 papers shows that several aspects of subjectivity remain understudied, among them the distinction between ambiguous and polyphonic input, whether subjectivity is effectively expressed to the user, and the interplay between different desiderata. The desiderata are intended as guidance for evaluating NLP systems that aim to represent diverse perspectives.
📝 Abstract
Subjective judgments are part of several NLP datasets, and recent work increasingly prioritizes models whose outputs reflect this diversity of perspectives. Such responses allow us to shed light on minority voices, which are frequently marginalized or obscured by dominant perspectives. It remains an open question whether our evaluation practices align with these models' objectives. This position paper proposes seven evaluation desiderata for subjectivity-sensitive models, rooted in how subjectivity is represented in NLP data and models. The desiderata are constructed top-down, keeping in mind the user-centric impact of such models. We scan the experimental setups of 60 papers and show that various aspects of subjectivity are still understudied, including the distinction between ambiguous and polyphonic input, whether subjectivity is effectively expressed to the user, and the interplay between different desiderata.
Problem

Research questions and friction points this paper is trying to address.

subjectivity
evaluation
NLP
polyphony
minority voices
Innovation

Methods, ideas, or system contributions that make the work stand out.

subjectivity evaluation
evaluation desiderata
polyphonic input
user-centric NLP
minority perspectives