🤖 AI Summary
The reproducibility crisis in scientific research stems partly from insufficient reporting transparency and inconsistent adherence to standardized practices. This study systematically evaluates 11 automated tools against nine rigor criteria from the ScreenIT group—including open data availability, explicit inclusion/exclusion criteria disclosure, and preregistration—assessing their capacity to detect compliance. Results reveal that individual tools exhibit limited sensitivity for some criteria, whereas ensemble approaches improve overall detection rates; notably, for open data statements a single tool clearly outperforms the rest. This work presents a broad cross-platform empirical comparison of tools, validating tool combination as an effective strategy for rigor assessment. Based on these findings, the authors propose concrete, evidence-based directions for tool development and refinement. All code and data are publicly released, providing a reproducible methodology and practical guidance to advance research transparency and reproducibility.
📝 Abstract
The causes of the reproducibility crisis include a lack of standardization and transparency in scientific reporting. Checklists such as ARRIVE and CONSORT seek to improve transparency, but authors do not always follow them, and peer review often fails to identify missing items. To address these issues, several automated tools have been designed to check different rigor criteria. We conducted a broad comparison of 11 automated tools across 9 rigor criteria from the ScreenIT group. For some criteria, including detection of open data, the comparison revealed a clear winner: a single tool that performed much better than the others. For other criteria, including detection of inclusion and exclusion criteria, a combination of tools exceeded the performance of any one tool. We also identified key areas where developers should focus their efforts to make their tools maximally useful. We conclude with a set of insights and recommendations for stakeholders in the development of rigor and transparency detection tools. The code and data for the study are available at https://github.com/PeterEckmann1/tool-comparison.