🤖 AI Summary
Current AI research narrowly equates “rigor” with methodological correctness, such as mathematical or statistical soundness, a conception that has contributed to inflated capability claims, weak empirical evidence, and obscured normative commitments. To address this, we propose a six-dimensional framework of *generalized rigor* encompassing methodological, epistemic, normative, conceptual, reporting, and interpretative dimensions, integrating responsible AI principles with scholarly quality-assessment criteria. Through conceptual analysis, interdisciplinary theoretical synthesis (drawing from philosophy of science, STS, and AI ethics), and normative framework design, we reframe how the AI community discusses research quality. The framework provides researchers, policymakers, and science communicators with actionable language and a dialogic toolset for evaluating rigor not merely as technical correctness, but as critical reflection on capability claims, evidential strength, and embedded values, supporting a broader shift toward responsible AI research.
📝 Abstract
In AI research and practice, rigor remains largely understood in terms of methodological rigor -- such as whether mathematical, statistical, or computational methods are correctly applied. We argue that this narrow conception of rigor has contributed to the concerns raised by the responsible AI community, including overblown claims about AI capabilities. Our position is that a broader conception of what rigorous AI research and practice should entail is needed. We believe such a conception -- in addition to a more expansive understanding of (1) methodological rigor -- should include aspects related to (2) what background knowledge informs what to work on (epistemic rigor); (3) how disciplinary, community, or personal norms, standards, or beliefs influence the work (normative rigor); (4) how clearly articulated the theoretical constructs in use are (conceptual rigor); (5) what is reported and how (reporting rigor); and (6) how well-supported the inferences from existing evidence are (interpretative rigor). In doing so, we also aim to provide useful language and a framework for much-needed dialogue among researchers, policymakers, journalists, and other stakeholders about the AI community's work.