🤖 AI Summary
This study addresses the long-standing neglect of negative results in scientific publishing and its detrimental impact on large language models (LLMs), which inherit and amplify this positivity bias when used for training, research assistance, and peer review, compromising both their reliability and the sustainability of their training data. The work offers the first systematic articulation of the renewed value of negative results in the LLM era. Through theoretical analysis, modeling of publication bias, and a behavioral evaluation framework, it demonstrates how the absence of negative findings degrades LLM performance across multiple scientific roles: research tool, training data consumer, and peer reviewer. By directly linking publication bias to model capability degradation, and by proposing both empirical validation protocols and structural reform pathways, the study lays a theoretical and practical foundation for a scholarly publishing paradigm that meaningfully incorporates failure data.
📝 Abstract
Scientific publishing systematically filters out negative results. We argue that this long-standing asymmetry has become an urgent problem in the era of large language models, which inherit the positivity bias of the literature they are trained on, face an impending shortage of high-quality training data, and are increasingly deployed as both research tools and peer reviewers. We analyze three ways in which LLMs have changed the value of failure data and show that the systematic absence of such data degrades their utility as research tools, training data consumers, and peer reviewers alike. We outline experimental protocols to validate these claims and discuss the structural conditions under which a failure-inclusive publishing culture could emerge.