PERSPECTRA: A Scalable and Configurable Pluralist Benchmark of Perspectives from Arguments

📅 2026-02-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the difficulty current large language models have in modeling the heterogeneity of human perspectives, a property that is largely absent from alignment research. It introduces PERSPECTRA, the first scalable and configurable benchmark for diverse viewpoints, which integrates Kialo’s structured debate graphs with the linguistic diversity of Reddit discussions. Through a controlled retrieval-and-expansion pipeline, the authors construct 3,810 enriched arguments spanning 762 pro/con stances on 100 controversial topics and define three evaluation tasks: opinion counting, opinion matching, and polarity check. Experiments reveal systematic failures in both open-source and proprietary models, such as overestimating the number of distinct viewpoints and misclassifying concessive structures, underscoring how difficult pluralism-aware understanding and reasoning remains.

📝 Abstract

Pluralism, the capacity to engage with diverse perspectives without collapsing them into a single viewpoint, is critical for developing large language models that faithfully reflect human heterogeneity. Yet this characteristic has not been carefully examined in the LLM research community and remains absent from most alignment studies. Debate-oriented sources provide a natural entry point for pluralism research. Previous work builds on online debate sources but remains constrained by costly human validation. Other debate-rich platforms such as Reddit and Kialo also offer promising material: Reddit provides linguistic diversity and scale but lacks clear argumentative structure, while Kialo supplies explicit pro/con graphs but remains overly concise and detached from natural discourse. We introduce PERSPECTRA, a pluralist benchmark that integrates the structural clarity of Kialo debate graphs with the linguistic diversity of real Reddit discussions. Using a controlled retrieval-and-expansion pipeline, we construct 3,810 enriched arguments spanning 762 pro/con stances on 100 controversial topics. Each opinion is expanded to multiple naturalistic variants, enabling robust evaluation of pluralism. We instantiate three tasks with PERSPECTRA: opinion counting (identifying distinct viewpoints), opinion matching (aligning supporting stances and discourse to source opinions), and polarity check (inferring aggregate stance in mixed discourse). Experiments with state-of-the-art open-source and proprietary LLMs highlight systematic failures, such as overestimating the number of viewpoints and misclassifying concessive structures, underscoring the difficulty of pluralism-aware understanding and reasoning. By combining diversity with structure, PERSPECTRA establishes the first scalable, configurable benchmark for evaluating how well models represent, distinguish, and reason over multiple perspectives.
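
To make the opinion counting task and the dataset figures concrete, the following is a minimal sketch in Python. It assumes that each enriched argument carries an identifier of the source stance it was expanded from; the names `Argument`, `stance_id`, `count_distinct_opinions`, and `signed_count_error` are hypothetical and are not taken from the paper or any released code. The gold count is taken to be the number of distinct source stances in a passage, so a positive signed error corresponds to the overestimation failure described in the abstract.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Argument:
    """One enriched argument (hypothetical record layout, not the paper's schema)."""
    topic: str      # controversial topic the argument belongs to
    stance_id: str  # assumed identifier of the source pro/con stance
    polarity: str   # "pro" or "con"
    text: str       # naturalistic variant of the source opinion


def count_distinct_opinions(arguments: Iterable[Argument]) -> int:
    """Gold answer under this sketch: number of distinct source stances."""
    return len({a.stance_id for a in arguments})


def signed_count_error(predicted: int, arguments: Iterable[Argument]) -> int:
    """Positive when a model reports more viewpoints than are present."""
    return predicted - count_distinct_opinions(arguments)


# Figures from the abstract: 3,810 arguments over 762 stances,
# which works out to an average of 5 variants per stance.
variants_per_stance = 3810 / 762

# Two variants of one stance plus one variant of another: two viewpoints.
sample = [
    Argument("topic-a", "s1", "pro", "first wording of stance s1"),
    Argument("topic-a", "s1", "pro", "second wording of stance s1"),
    Argument("topic-a", "s2", "con", "a wording of stance s2"),
]
error = signed_count_error(3, sample)  # model treated each variant as a new view
```

In this toy sample, a model that answers 3 has counted paraphrases of the same stance as separate viewpoints, which is one plausible way the reported overestimation could arise. The paper's actual scoring procedure may differ.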

Problem

Research questions and friction points this paper is trying to address.

pluralism

perspective diversity

large language models

argument understanding

benchmark evaluation

Innovation

Methods, ideas, or system contributions that make the work stand out.

pluralism

argument structure

benchmark construction