Naturalistic measure of social norms alignment

📅 2026-05-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the challenge of evaluating alignment with social norms in natural, open-ended scenarios, where existing approaches typically rely on closed-form questionnaires or agreement with predefined statements. The authors propose a solution-matching framework that measures agreement between any two responses to the same social dilemma (for example, an LLM and a human, two LLMs, or two humans) in free-form dialogue. They construct a dataset of 3,000 non-trivial social dilemmas in Danish and introduce two metrics: stated and explicit agreement accuracy. Each dilemma is assigned reference solutions derived from three panelists, who serve as culturally grounded judges. Experiments show that the metrics produce consistent rankings of LLMs and that agreement varies by dilemma type, with higher agreement on topics such as neighbor conflicts and shared living situations.

📝 Abstract

Social norms reflect shared expectations about acceptable behavior. Measuring social norms alignment remains challenging, with existing approaches typically relying on artificial closed-form evaluations such as multiple-choice questionnaires or measuring agreement with predefined statements. In this work, social norms alignment refers to the agreement between solutions to a given social problem or dilemma. We propose a framework for measuring social norms alignment in naturalistic, free-form settings through solution matching. The framework enables us to measure alignment between any two dilemma responses, e.g., LLM to human, LLM to LLM, or human to human. We introduce two metrics, stated and explicit agreement accuracy, and construct a dataset of 3k non-trivial social dilemmas in Danish. All dilemmas are assigned reference solutions derived from three panelists, who serve as culturally grounded judges. We evaluate the agreement of several LLMs and human responses in an interaction setup that resembles natural user-model conversations. Our results show that the proposed metrics produce consistent model rankings and reveal variation in agreement across different types of dilemmas, with higher agreement observed for topics such as neighbor conflicts and shared living situations. Overall, our work introduces a dataset and evaluation framework for studying culturally grounded social reasoning in naturalistic open-ended conversations.
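The abstract does not define how the two metrics are computed, so the following is only a rough sketch of what an agreement-accuracy score over solution matches might look like. It assumes that some judge (human or model) has already labeled each response as agreeing or not agreeing with the reference solution. The `MatchJudgment` record, its fields, the function names, and the example data are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchJudgment:
    """Hypothetical record: one response compared against one reference solution."""
    dilemma_id: str
    topic: str      # assumed topic label, e.g. "neighbor conflict"
    agrees: bool    # assumed judge output: does the response match the reference?


def agreement_accuracy(judgments):
    """Fraction of judged responses that agree with their reference solution."""
    if not judgments:
        raise ValueError("need at least one judgment")
    return sum(j.agrees for j in judgments) / len(judgments)


def agreement_by_topic(judgments):
    """Agreement accuracy computed separately for each topic label."""
    grouped = defaultdict(list)
    for j in judgments:
        grouped[j.topic].append(j)
    return {topic: agreement_accuracy(items) for topic, items in grouped.items()}


# Made-up example data, for illustration only.
example = [
    MatchJudgment("d1", "neighbor conflict", True),
    MatchJudgment("d2", "neighbor conflict", True),
    MatchJudgment("d3", "workplace", False),
    MatchJudgment("d4", "workplace", True),
]

overall = agreement_accuracy(example)
per_topic = agreement_by_topic(example)
print(overall, per_topic)
```

Grouping by topic mirrors the kind of per-dilemma-type breakdown the abstract reports. How the paper distinguishes stated from explicit agreement is not specified here, so the sketch uses a single generic agreement label.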

Problem

Research questions and friction points this paper is trying to address.

social norms alignment

naturalistic evaluation

social dilemmas

solution matching

culturally grounded reasoning

Innovation

Methods, ideas, or system contributions that make the work stand out.

social norms alignment

solution matching

naturalistic evaluation