🤖 AI Summary
The impact of the rise of large language models (LLMs) on the direction, diversity, and influence of U.S. federal research funding remains unclear. This study links confidential grant proposals submitted to the NSF and NIH by two R1 universities with publicly available award data, using natural language processing and semantic similarity analysis to provide the first large-scale empirical examination of how LLM usage relates to the semantic distinctiveness of research proposals, funding success rates, and subsequent scholarly output. Findings reveal a sharp rise in LLM use since 2023, with a bimodal distribution indicating a split between minimal and substantive use; higher LLM usage correlates with lower semantic distinctiveness. At NIH, LLM use is also associated with greater funding success and increased publication output, concentrated in non-highly-cited papers, whereas no comparable effects appear in NSF submissions. Together, these findings offer critical empirical evidence for AI's role in research governance.
📝 Abstract
Federal research funding shapes the direction, diversity, and impact of the US scientific enterprise. Large language models (LLMs) are rapidly diffusing into scientific practice, holding substantial promise while raising widespread concerns. Despite growing attention to AI use in scientific writing and evaluation, little is known about how the rise of LLMs is reshaping the public funding landscape. Here, we examine LLM involvement at key stages of the federal funding pipeline by combining two complementary data sources: confidential National Science Foundation (NSF) and National Institutes of Health (NIH) proposal submissions from two large US R1 universities, including funded, unfunded, and pending proposals, and the full population of publicly released NSF and NIH awards. We find that LLM use rises sharply beginning in 2023 and exhibits a bimodal distribution, indicating a clear split between minimal and substantive use. Across both private submissions and public awards, higher LLM involvement is consistently associated with lower semantic distinctiveness, positioning projects closer to recently funded work within the same agency. The consequences of this shift are agency-dependent. LLM use is positively associated with proposal success and higher subsequent publication output at NIH, whereas no comparable associations are observed at NSF. Notably, the productivity gains at NIH are concentrated in non-hit papers rather than the most highly cited work. Together, these findings provide large-scale evidence that the rise of LLMs is reshaping how scientific ideas are positioned, selected, and translated into publicly funded research, with implications for portfolio governance, research diversity, and the long-run impact of science.
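The semantic-distinctiveness measure described above can be sketched as one minus the maximum similarity between a proposal and recently funded awards from the same agency. The snippet below is a minimal stand-in using bag-of-words cosine similarity; the study's actual pipeline (presumably based on dense text embeddings) is not specified here, and the function name and toy corpus are illustrative assumptions.

```python
from collections import Counter
import math

def _vec(text):
    # Bag-of-words term counts; a dense embedding model would
    # likely be used in practice, this is only an illustration.
    return Counter(text.lower().split())

def _cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def semantic_distinctiveness(proposal, recent_awards):
    """1 - max similarity to recently funded work in the same agency.

    Lower values mean the proposal sits closer to the existing
    funded portfolio; higher values mean it is more distinctive.
    """
    sims = [_cosine(_vec(proposal), _vec(a)) for a in recent_awards]
    return 1.0 - max(sims, default=0.0)
```

Under this formulation, a proposal that closely echoes a recent award scores near 0, while one with no lexical overlap with the funded portfolio scores near 1; the paper's finding is that higher LLM involvement is associated with lower scores on such a measure.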