🤖 AI Summary
This study investigates how emotions systematically influence the moral judgments of large language models (LLMs), addressing a gap at the intersection of affective computing and machine moral reasoning. The authors construct an emotion-induction pipeline that injects specific emotional states into diverse moral scenarios, then combine multi-dataset evaluation, an analysis across model capability levels, and a human comparison study. They find that emotions exert directional effects on LLMs’ moral acceptability judgments: positive emotions raise acceptability and negative emotions lower it, with binary moral conclusions reversed in up to 20% of cases. Higher-capability models are less susceptible to these shifts, and certain emotions, such as remorse, produce effects contrary to valence-based expectations. Human annotators do not show the same systematic shifts, which points to a gap between model and human moral reasoning.
📝 Abstract
Large language models have been extensively studied for emotion recognition and moral reasoning as distinct capabilities, yet the extent to which emotions influence moral judgment remains underexplored. In this work, we develop an emotion-induction pipeline that infuses emotion into moral situations and evaluate shifts in moral acceptability across multiple datasets and LLMs. We observe a directional pattern: positive emotions increase moral acceptability and negative emotions decrease it, with effects strong enough to reverse binary moral judgments in up to 20% of cases, and with susceptibility scaling inversely with model capability. Our analysis further reveals that specific emotions can sometimes behave contrary to what their valence would predict (e.g., remorse paradoxically increases acceptability). A complementary human annotation study shows humans do not exhibit these systematic shifts, indicating an alignment gap in current LLMs.
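The two quantities the abstract reports, the directional shift in acceptability and the share of reversed binary judgments, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Judgment` record, the 0-to-1 acceptability scale, the 0.5 decision threshold, and the toy scores are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Judgment:
    """One moral scenario scored twice (hypothetical record, 0-1 scale assumed)."""
    baseline: float  # acceptability score for the neutral scenario
    induced: float   # score for the same scenario after emotion induction


def mean_shift(judgments):
    """Average change in acceptability; positive means the emotion raised it."""
    return sum(j.induced - j.baseline for j in judgments) / len(judgments)


def flip_rate(judgments, threshold=0.5):
    """Fraction of scenarios whose binary verdict changes at the assumed threshold."""
    flips = sum(
        (j.baseline >= threshold) != (j.induced >= threshold) for j in judgments
    )
    return flips / len(judgments)


# Toy scores, invented purely to exercise the two functions.
toy = [
    Judgment(0.40, 0.60),  # unacceptable -> acceptable (flip)
    Judgment(0.70, 0.80),
    Judgment(0.20, 0.30),
    Judgment(0.55, 0.45),  # acceptable -> unacceptable (flip)
    Judgment(0.90, 0.90),
]

print(f"mean shift: {mean_shift(toy):+.2f}")
print(f"flip rate:  {flip_rate(toy):.0%}")
```

Computing both numbers per emotion and per model would separate a small, consistent drift in scores from changes large enough to cross the decision boundary, which is the distinction the abstract draws.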