🤖 AI Summary
This work addresses the tendency of large language models to overlook commonsense contradictions embedded in moral reasoning. To this end, the authors introduce CoMoral, the first benchmark that integrates moral dilemmas with explicit commonsense conflicts, enabling a systematic evaluation of commonsense awareness across ten prominent models through carefully crafted scenarios. The study identifies and formally defines a novel phenomenon termed “narrative focus bias”: models are more likely to detect commonsense violations committed by secondary characters than those committed by the narrator. Experimental results demonstrate that current models generally fail to recognize commonsense inconsistencies within moral contexts without explicit prompting, and that their performance is significantly influenced by narrative perspective. These findings underscore the urgent need to strengthen the commonsense reasoning of language models operating in ethically sensitive domains.
📝 Abstract
Large Language Models (LLMs) are increasingly deployed across diverse real-world applications and user communities. It is therefore crucial that these models remain both morally grounded and knowledge-aware. In this work, we uncover a critical limitation of current LLMs -- their tendency to prioritize moral reasoning over commonsense understanding. To investigate this phenomenon, we introduce CoMoral, a novel benchmark dataset containing commonsense contradictions embedded within moral dilemmas. Through extensive evaluation of ten LLMs of different model sizes, we find that existing models consistently struggle to identify such contradictions without an explicit prompt. Furthermore, we observe a pervasive narrative focus bias, wherein LLMs more readily detect commonsense contradictions when they are attributed to a secondary character rather than to the primary (narrator) character. Our comprehensive analysis underscores the need for reasoning-aware training to improve the commonsense robustness of large language models.