Code Contribution and Credit in Science

📅 2025-10-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Scientific software development is under-recognized in academic credit allocation, leaving code contributions misaligned with traditional authorship. Method: Using approximately 140,000 paired research articles and code repositories, we build a predictive model that links article authors to repository developer accounts, then run statistical analyses that control for domain, article type, and open access status. Contribution/Results: Nearly 30% of articles have code contributors who are not listed as authors. Code-contributing authors are associated with a modest ~4.2% increase in article citations, but the effect is not significant once those controls are applied. First authors are significantly more likely than authors in other positions to contribute code. Authors who contribute code more frequently have progressively lower h-indices than non-coding colleagues, even after controlling for publication count, author position, domain, and article type. These findings point to a disconnect between software contributions and scholarly credit, with implications for institutional reward structures and science policy.

📝 Abstract
Software development has become essential to scientific research, but its relationship to traditional metrics of scholarly credit remains poorly understood. We develop a dataset of approximately 140,000 paired research articles and code repositories, as well as a predictive model that matches research article authors with software repository developer accounts. We use this data to investigate how software development activities influence credit allocation in collaborative scientific settings. Our findings reveal significant patterns distinguishing software contributions from traditional authorship credit. We find that nearly 30% of articles include non-author code contributors: individuals who participated in software development but received no formal authorship recognition. While code-contributing authors show a modest $\sim$4.2% increase in article citations, this effect becomes non-significant when controlling for domain, article type, and open access status. First authors are significantly more likely to be code contributors than authors in other positions. Notably, we identify a negative relationship between coding frequency and scholarly impact metrics. Authors who contribute code more frequently exhibit progressively lower h-indices than non-coding colleagues, even when controlling for publication count, author position, domain, and article type. These results suggest a disconnect between software contributions and credit, highlighting important implications for institutional reward structures and science policy.
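The abstract uses the h-index as its measure of scholarly impact. For readers unfamiliar with the metric, here is a minimal Python sketch of the standard definition: the largest h such that an author has at least h papers with at least h citations each. The function name and the sample citation counts are illustrative assumptions, not data or code from the paper.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            # The top `rank` papers all have at least `rank` citations.
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for one author's five papers.
example_counts = [10, 8, 5, 4, 3]
example_h = h_index(example_counts)
```

Because the metric depends only on per-paper citation counts, contributions that never appear as authored papers, such as code committed by a non-author, cannot raise it.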
Problem

Research questions and friction points this paper is trying to address.

Investigating how software development affects credit allocation in scientific collaboration
Analyzing the relationship between code contributions and traditional authorship recognition
Examining the disconnect between software contributions and scholarly impact metrics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dataset pairs articles with code repositories
Predictive model matches authors to developers
Analyzes software impact on credit allocation
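The idea of matching article authors to repository developers, and then flagging developers with no matching author, can be illustrated with a deliberately naive baseline. This sketch is not the paper's predictive model, whose features and architecture are not described here. It assumes that only display names are available and compares them by string similarity using Python's standard `difflib`. The function names, the 0.8 threshold, and the sample names are all hypothetical.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase a name and collapse punctuation and extra whitespace."""
    cleaned = name.lower()
    for ch in "._-":
        cleaned = cleaned.replace(ch, " ")
    return " ".join(cleaned.split())


def name_similarity(a, b):
    """Similarity in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_authors(authors, developers, threshold=0.8):
    """Map each author to the most similar developer name, if similar enough."""
    matches = {}
    for author in authors:
        best = max(developers, key=lambda d: name_similarity(author, d), default=None)
        if best is not None and name_similarity(author, best) >= threshold:
            matches[author] = best
    return matches


def non_author_contributors(authors, developers, threshold=0.8):
    """Developers that no author was matched to (assumed threshold)."""
    matched = set(match_authors(authors, developers, threshold).values())
    return [d for d in developers if d not in matched]


# Hypothetical article byline and repository contributor names.
authors = ["Jane Doe", "Wei Zhang"]
developers = ["jane doe", "R. Smith"]
unlisted = non_author_contributors(authors, developers)
```

A name-only baseline like this would miss developers who use handles unrelated to their names, which is one reason a learned matching model is a more plausible choice for data at this scale.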