🤖 AI Summary
This study addresses the difficulty of identifying Schwartz values in political texts, where cues are often implicit and neighboring values are easily confused. It evaluates how three factors affect sentence-level value detection: context length (sentence, window, or full document), retrieval-augmented moral knowledge from a curated knowledge base, and model family and scale (supervised DeBERTa-v3-base/large encoders and zero-shot large language models with 12B–123B parameters). Full-document context improves the supervised DeBERTa encoders by 3.8–4.8 macro-F1 points over sentence-only input but does not consistently help zero-shot LLMs. Retrieved moral knowledge, added through early fusion, improves every tested model family and context condition in matched comparisons. Larger models do not guarantee gains, and simple early fusion outperforms the tested late-fusion and cross-attention RAG variants for encoders. Context and retrieval help most for socially situated or conceptually confusable values.
📝 Abstract
Detecting Schwartz values in political text is difficult because implicit cues often depend on surrounding arguments and fine-grained distinctions between neighboring values. We study when context and explicit moral knowledge help sentence-level value detection. Using the ValuesML/Touché ValueEval format, we compare sentence, window, and full-document inputs; no-RAG and retrieval-augmented settings with a curated moral knowledge base; supervised DeBERTa-v3-base/large encoders; and zero-shot LLMs from 12B to 123B parameters. The results show that more context is not uniformly better: full-document context improves supervised DeBERTa encoders by 3.8–4.8 macro-F1 points over sentence-only input, but does not consistently help zero-shot LLMs. Retrieved moral knowledge is more consistently useful in matched comparisons, improving each tested model family and context condition under early fusion. However, scaling from DeBERTa-v3-base to large and from 12B to larger LLMs does not guarantee gains, and simple early fusion outperforms the tested late-fusion and cross-attention RAG variants for encoders. Per-value analyses show that context and retrieval help most for socially situated or conceptually confusable values. These findings suggest that value-sensitive NLP should evaluate context, knowledge, and model family jointly rather than treating longer inputs or larger models as universal improvements.
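To make the compared conditions concrete, the following is a minimal sketch of how sentence, window, and full-document inputs might be assembled, with retrieved knowledge appended to the input text as one plausible reading of "early fusion", plus a per-label macro-F1. It is not the authors' implementation: the function names `build_input` and `macro_f1`, the `[SEP]` separator, the window size, and the ordering of target, context, and knowledge are all assumptions made for illustration.

```python
from typing import List, Optional, Set


def build_input(
    sentences: List[str],
    idx: int,
    mode: str = "sentence",
    window: int = 1,                        # assumed window size, not from the paper
    knowledge: Optional[List[str]] = None,  # hypothetical retrieved passages
    sep: str = " [SEP] ",                   # assumed separator string
) -> str:
    """Build one text input for the target sentence at position idx."""
    if mode == "sentence":
        context: List[str] = []
    elif mode == "window":
        lo = max(0, idx - window)
        hi = min(len(sentences), idx + window + 1)
        context = sentences[lo:idx] + sentences[idx + 1:hi]
    elif mode == "document":
        context = sentences[:idx] + sentences[idx + 1:]
    else:
        raise ValueError(f"unknown mode: {mode}")

    parts = [sentences[idx]]
    if context:
        parts.append(" ".join(context))
    if knowledge:
        # Early fusion (as assumed here): knowledge joins the input text
        # before encoding, rather than being merged after encoding.
        parts.append(" ".join(knowledge))
    return sep.join(parts)


def macro_f1(gold: List[Set[str]], pred: List[Set[str]], labels: List[str]) -> float:
    """Unweighted mean of per-label F1 over multi-label predictions."""
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Under these assumptions, `build_input(doc, i, mode="document", knowledge=passages)` would produce the full-document, retrieval-augmented input for sentence `i`, and `macro_f1` gives each value label equal weight regardless of how often it occurs, which is why gains on rare values can move the score noticeably.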