🤖 AI Summary
This work addresses the tension between universality and personalization in current large language model alignment approaches, which either overlook norms of minority groups or incur prohibitive costs. To bridge this gap, the paper proposes “community-level alignment” as a novel intermediate paradigm and introduces CommunityBench—the first large-scale benchmark for evaluating models’ capacity to capture community-specific preferences. Grounded in sociological theories of shared identity and social bonding, CommunityBench comprises four carefully designed tasks that systematically assess how well models represent diverse community values. Experiments reveal that prevailing models exhibit limited ability to model such community-specific preferences. Crucially, community-level alignment not only effectively captures group diversity but also enhances individual-level alignment, offering a scalable and inclusive pathway toward value-aligned AI systems.
📝 Abstract
Large language model (LLM) alignment ensures that model behavior reflects human values. Existing alignment strategies primarily follow two paths: one assumes a universal value set for a unified goal (i.e., one-size-fits-all), while the other treats every individual as unique and customizes models accordingly (i.e., individual-level). However, assuming a monolithic value space marginalizes minority norms, while tailoring models to each individual is prohibitively expensive. Recognizing that human society is organized into social clusters with high intra-group value alignment, we propose community-level alignment as a "middle ground". Practically, we introduce CommunityBench, the first large-scale benchmark for community-level alignment evaluation, featuring four tasks grounded in Common Identity and Common Bond theory. With CommunityBench, we conduct a comprehensive evaluation of various foundation models, revealing that current LLMs exhibit limited capacity to model community-specific preferences. Furthermore, we investigate the potential of community-level alignment to facilitate individual modeling, providing a promising direction for scalable and pluralistic alignment.