The Rise of Large Language Models and the Direction and Impact of US Federal Research Funding

📅 2026-01-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
The impact of the rise of large language models (LLMs) on the direction, diversity, and influence of U.S. federal research funding remains unclear. This study integrates confidential grant proposals submitted to the NSF and NIH by two R1 universities with publicly available award data, using natural language processing and semantic similarity analysis to provide the first large-scale empirical examination of how LLM usage relates to the semantic distinctiveness of research proposals, funding success rates, and subsequent scholarly output. Findings reveal a sharp rise in LLM use since 2023, with a bimodal distribution separating minimal from substantive use, and show that higher LLM usage correlates with lower semantic distinctiveness. At NIH, LLM use is associated with greater funding success and increased publication output, concentrated in non-highly cited papers; no comparable effect appears in NSF submissions. Together, these results offer empirical evidence on AI's emerging role in research governance.

📝 Abstract
Federal research funding shapes the direction, diversity, and impact of the US scientific enterprise. Large language models (LLMs) are rapidly diffusing into scientific practice, holding substantial promise while raising widespread concerns. Despite growing attention to AI use in scientific writing and evaluation, little is known about how the rise of LLMs is reshaping the public funding landscape. Here, we examine LLM involvement at key stages of the federal funding pipeline by combining two complementary data sources: confidential National Science Foundation (NSF) and National Institutes of Health (NIH) proposal submissions from two large US R1 universities, including funded, unfunded, and pending proposals, and the full population of publicly released NSF and NIH awards. We find that LLM use rises sharply beginning in 2023 and exhibits a bimodal distribution, indicating a clear split between minimal and substantive use. Across both private submissions and public awards, higher LLM involvement is consistently associated with lower semantic distinctiveness, positioning projects closer to recently funded work within the same agency. The consequences of this shift are agency-dependent. LLM use is positively associated with proposal success and higher subsequent publication output at NIH, whereas no comparable associations are observed at NSF. Notably, the productivity gains at NIH are concentrated in non-hit papers rather than the most highly cited work. Together, these findings provide large-scale evidence that the rise of LLMs is reshaping how scientific ideas are positioned, selected, and translated into publicly funded research, with implications for portfolio governance, research diversity, and the long-run impact of science.
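The abstract describes measuring the "semantic distinctiveness" of proposals via semantic similarity analysis but does not spell out the computation. A minimal sketch, assuming proposal texts have already been embedded as vectors, is to score a proposal by one minus its maximum cosine similarity to recently funded awards in the same agency; the function name and toy vectors below are hypothetical, and the paper's exact measure may differ:

```python
import numpy as np

def semantic_distinctiveness(proposal_vec, funded_vecs):
    """1 minus the maximum cosine similarity between a proposal
    embedding and embeddings of recently funded awards.
    Illustrative definition only; not the paper's released code."""
    p = proposal_vec / np.linalg.norm(proposal_vec)
    F = funded_vecs / np.linalg.norm(funded_vecs, axis=1, keepdims=True)
    return 1.0 - float(np.max(F @ p))

# Toy 2-D "embeddings": two funded awards clustered near [1, 0]
funded = np.array([[1.0, 0.0],
                   [0.9, 0.1]])
close = semantic_distinctiveness(np.array([1.0, 0.05]), funded)  # near the cluster
far = semantic_distinctiveness(np.array([0.0, 1.0]), funded)     # orthogonal to it
assert close < far  # proposals resembling funded work score as less distinctive
```

Under this kind of measure, the abstract's finding that higher LLM involvement is associated with lower semantic distinctiveness corresponds to proposals sitting closer (in embedding space) to recently funded work.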
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Federal Research Funding
Scientific Diversity
Proposal Evaluation
Research Impact
Innovation

Methods, ideas, or system contributions that make the work stand out.

large language models
federal research funding
scientific novelty
proposal success
research productivity
Yifan Qian
Research Assistant Professor, Kellogg School of Management, Northwestern University
Science of Science · Innovation · Computational Social Science · Network Science · Machine Learning
Zhe Wen
Center for Science of Science and Innovation, Northwestern University, Evanston, IL, USA; Ryan Institute on Complexity, Northwestern University, Evanston, IL, USA; Northwestern Innovation Institute, Northwestern University, Evanston, IL, USA; Kellogg School of Management, Northwestern University, Evanston, IL, USA
Alexander C. Furnas
Center for Science of Science and Innovation, Northwestern University, Evanston, IL, USA; Ryan Institute on Complexity, Northwestern University, Evanston, IL, USA; Northwestern Innovation Institute, Northwestern University, Evanston, IL, USA; Kellogg School of Management, Northwestern University, Evanston, IL, USA
Yue Bai
Northwestern University, Northeastern University
Multi-modal learning · Sparse network training · Mask learning
Erzhuo Shao
Center for Science of Science and Innovation, Northwestern University, Evanston, IL, USA; Ryan Institute on Complexity, Northwestern University, Evanston, IL, USA; Northwestern Innovation Institute, Northwestern University, Evanston, IL, USA; McCormick School of Engineering, Northwestern University, Evanston, IL, USA
Dashun Wang
Kellogg Chair of Technology, Kellogg School of Management, Northwestern University
Science of Science · Innovation · Computational Social Science · Network Science · Complex Systems