📝 Abstract
Generative AI models, such as ChatGPT, will increasingly replace humans in producing output for a variety of important tasks. While prior work has mostly focused on the improvement in the average performance of generative AI models relative to humans, much less attention has been paid to the significant reduction in the variance of the output these models produce. In this Perspective, we demonstrate that generative AI models are inherently prone to "regression toward the mean": variance in their output tends to shrink relative to that of real-world distributions. We discuss the potential social implications of this phenomenon across three levels (societal, group, and individual) and two dimensions (material and non-material). Finally, we discuss interventions to mitigate its negative effects, considering the roles of both service providers and users. Overall, this Perspective aims to raise awareness of the importance of output variance in generative AI and to foster collaborative efforts to meet the challenges posed by the reduction of variance in AI-generated output.
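The variance shrinkage the abstract describes can be illustrated with a minimal sketch of temperature-scaled softmax sampling, a standard decoding knob in generative models (this is an illustrative toy, not the paper's specific analysis; the logits and value grid below are made up for demonstration). Lowering the temperature concentrates probability mass on the modal output, so the variance of sampled outputs falls even though the mean stays near the mode:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature < 1 sharpens the distribution; > 1 flattens it."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max()  # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

# Hypothetical next-token scores and the scalar "output" each token represents.
logits = np.array([2.0, 1.0, 0.5, 0.0])
values = np.array([0.0, 1.0, 2.0, 3.0])

for t in (1.5, 1.0, 0.5, 0.1):
    p = softmax(logits, temperature=t)
    mean = (p * values).sum()
    var = (p * (values - mean) ** 2).sum()  # analytic variance under p
    print(f"T={t:>3}: variance of sampled output = {var:.3f}")
```

Running this shows the output variance shrinking monotonically as the temperature drops, which is one mechanism behind the "regression toward the mean" effect: common deployment defaults trade diversity for perceived quality. The server-side intervention the paper gestures at would correspond to tuning such sampling parameters upward.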