AI Summary
This paper addresses the problem of quantitatively predicting human-perceived misery from natural-language descriptions of real-world scenarios using large language models (LLMs), formulated as a regression task over a 0–100 scale. Methodologically, it evaluates several prompting strategies: zero-shot prompting, fixed-context few-shot prompting, and retrieval-based few-shot prompting that uses BERT sentence embeddings. A key contribution is the "Misery Game Show", a gamified, feedback-driven evaluation framework inspired by a television format, whose structured rounds combine ordinal comparison, binary classification, scalar estimation, and feedback-driven reasoning. The framework assesses not only predictive accuracy but also the model's ability to adapt to corrective feedback. Experimental results show that few-shot prompting consistently outperforms zero-shot baselines, supporting the feasibility of LLMs for affective prediction and dynamic emotional reasoning.
Abstract
This study investigates the use of Large Language Models (LLMs) for predicting human-perceived misery scores from natural language descriptions of real-world scenarios. The task is framed as a regression problem, where the model assigns a scalar value from 0 to 100 to each input statement. We evaluate multiple prompting strategies, including zero-shot, fixed-context few-shot, and retrieval-based prompting using BERT sentence embeddings. Few-shot approaches consistently outperform zero-shot baselines, underscoring the value of contextual examples in affective prediction. To move beyond static evaluation, we introduce the "Misery Game Show", a novel gamified framework inspired by a television format. It tests LLMs through structured rounds involving ordinal comparison, binary classification, scalar estimation, and feedback-driven reasoning. This setup enables us to assess not only predictive accuracy but also the model's ability to adapt based on corrective feedback. The gamified evaluation highlights the broader potential of LLMs in dynamic emotional reasoning tasks beyond standard regression. Code and data link: https://github.com/abhi1nandy2/Misery_Data_Exps_GitHub
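To make the retrieval-based prompting idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a small pool of labeled statements, retrieves the most similar ones to a query by cosine similarity, and assembles a few-shot prompt. The `embed` function is a toy bag-of-words stand-in for the BERT sentence embeddings mentioned in the abstract, and the example statements, scores, and prompt wording are all hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for a BERT sentence embedding: bag-of-words counts (assumption)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, pool, k=2):
    """Return the k labeled examples most similar to the query."""
    q = embed(query)
    return sorted(pool, key=lambda ex: cosine(q, embed(ex[0])), reverse=True)[:k]

def build_prompt(query, pool, k=2):
    """Assemble a few-shot prompt from retrieved examples (hypothetical wording)."""
    parts = ["Rate how miserable each situation is on a scale from 0 to 100."]
    for text, score in retrieve(query, pool, k):
        parts.append(f"Situation: {text}\nMisery score: {score}")
    parts.append(f"Situation: {query}\nMisery score:")
    return "\n\n".join(parts)

# Made-up statements and scores for illustration only; not taken from the paper's data.
POOL = [
    ("You drop your phone in a lake", 40),
    ("Your flight is delayed by six hours", 35),
    ("You lose your wallet on vacation", 55),
]

if __name__ == "__main__":
    print(build_prompt("You drop your phone in the toilet", POOL, k=2))
```

In a realistic setup, `embed` would be replaced by a sentence-embedding model and the resulting prompt sent to an LLM, whose numeric reply would be parsed as the predicted score; both steps are left out here because the source does not specify them.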