Labels have Human Values: Value Calibration of Subjective Tasks

📅 2026-01-10
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge that annotations in subjective tasks inherently reflect diverse human values, which conventional models overlook, often yielding predictions misaligned with specific value groups. To bridge this gap, the authors propose MC-STL, a framework that explicitly models the implicit value structure of annotations. By clustering annotators based on rationale similarity, expert-defined value categories, or sociocultural attributes, the method identifies distinct value clusters and learns cluster-specific embeddings that enable value-calibrated predictions. Combining clustering analysis, value-oriented embeddings, and multi-task learning, MC-STL supports a range of subjective task formulations, including ordinal, binary, and preference-based settings. Experiments on datasets covering toxic dialogue, offensive language detection, and human preference alignment demonstrate consistent and significant improvements over baselines that ignore value heterogeneity, particularly in discriminative performance, value-specific calibration, and disagreement-aware metrics.

📝 Abstract
Building NLP systems for subjective tasks requires ensuring their alignment with contrasting human values. We propose the MultiCalibrated Subjective Task Learner framework (MC-STL), which clusters annotations into identifiable human value clusters using one of three approaches (similarity of annotator rationales, expert value taxonomies, or raters' sociocultural descriptors) and calibrates predictions for each value cluster by learning cluster-specific embeddings. We demonstrate MC-STL on several subjective learning settings, including ordinal, binary, and preference learning predictions, and evaluate it on multiple datasets covering toxic chatbot conversations, offensive social media posts, and human preference alignment. The results show that MC-STL consistently outperforms baselines that ignore the latent value structure of the annotations, delivering gains in discrimination, value-specific calibration, and disagreement-aware metrics.
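The cluster-then-calibrate idea described in the abstract can be sketched in a few lines. Everything below (the function names, the descriptor-based clustering, the per-cluster mean) is an illustrative assumption, not the authors' implementation, which learns cluster-specific embeddings within a multi-task model rather than simple averages:

```python
# Minimal sketch of the MC-STL idea: group annotators into value
# clusters, then produce one calibrated prediction per cluster
# instead of a single majority label. All names are hypothetical.

from collections import defaultdict

def cluster_annotators(profiles):
    """Toy stand-in for the paper's three clustering approaches:
    here we simply bucket annotators by a sociocultural descriptor."""
    clusters = defaultdict(list)
    for annotator, descriptor in profiles.items():
        clusters[descriptor].append(annotator)
    return clusters

def cluster_predictions(annotations, clusters):
    """Per-cluster mean rating: a crude proxy for learning
    cluster-specific embeddings and value-calibrated outputs."""
    preds = {}
    for name, members in clusters.items():
        ratings = [annotations[a] for a in members if a in annotations]
        preds[name] = sum(ratings) / len(ratings)
    return preds

# Hypothetical example: two annotator groups disagree on a binary
# offensiveness label (1 = offensive).
profiles = {"a1": "groupA", "a2": "groupA", "a3": "groupB"}
annotations = {"a1": 1, "a2": 0, "a3": 1}
clusters = cluster_annotators(profiles)
print(cluster_predictions(annotations, clusters))
```

A single aggregated model would collapse this disagreement into one score; the per-cluster view keeps the groups' divergent values visible, which is what the paper's disagreement-aware metrics reward.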
Problem

Research questions and friction points this paper is trying to address.

human values
subjective tasks
value alignment
annotation disagreement
NLP
Innovation

Methods, ideas, or system contributions that make the work stand out.

value calibration
subjective NLP tasks
human value clustering
multi-calibrated learning
annotation disagreement
🔎 Similar Papers
No similar papers found.
Mohammed Fayiz Parappan
Department of Electrical and Computer Engineering, Duke University
Ricardo Henao
Duke University
Machine Learning