AI Summary
U.S. academic HPC resources are growing at a compound annual growth rate (CAGR) of only 18%, substantially lagging behind national (43%) and industrial (78%) rates, revealing a structural capability gap, particularly for GPU-intensive AI workloads. Method: This study systematically assesses the current state of HPC infrastructure across U.S. universities, benchmarking against DOE leadership-class systems and industrial AI infrastructure through integrated analysis of computational capacity, architectural evolution, governance models, and energy efficiency. Contribution/Results: We propose a novel federated computing framework, a dynamic idle-GPU scheduling mechanism, and a fair cost-allocation model; additionally, we explore a decentralized reinforcement learning paradigm to enhance the accessibility and sustainability of campus AI training resources. The findings provide empirically grounded, actionable recommendations for optimizing academic HPC policy and fostering cross-sectoral technological collaboration.
Abstract
The rapid growth of AI, data-intensive science, and digital twin technologies has driven an unprecedented demand for high-performance computing (HPC) across the research ecosystem. While national laboratories and industrial hyperscalers have invested heavily in exascale and GPU-centric architectures, university-operated HPC systems remain comparatively under-resourced. This survey presents a comprehensive assessment of the HPC landscape across U.S. universities, benchmarking their capabilities against Department of Energy (DOE) leadership-class systems and industrial AI infrastructures. We examine over 50 premier research institutions, analyzing compute capacity, architectural design, governance models, and energy efficiency. Our findings reveal that university clusters, though vital for academic research, exhibit significantly lower growth trajectories (CAGR $\approx$ 18%) than their national ($\approx$ 43%) and industrial ($\approx$ 78%) counterparts. The increasing skew toward GPU-dense AI workloads has widened the capability gap, highlighting the need for federated computing, idle-GPU harvesting, and cost-sharing models. We also identify emerging paradigms, such as decentralized reinforcement learning, as promising opportunities for democratizing AI training within campus environments. Ultimately, this work provides actionable insights for academic leaders, funding agencies, and technology partners to ensure more equitable and sustainable HPC access in support of national research priorities.
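To make concrete how quickly these growth-rate differences compound into a capability gap, the following is a back-of-envelope sketch: it applies the reported CAGRs (18%, 43%, 78%) to a common normalized starting capacity over a hypothetical five-year horizon. The starting capacity and time horizon are illustrative assumptions, not figures from the survey.

```python
# Illustrative compounding of the reported CAGRs from a common,
# normalized starting capacity of 1.0 (assumption, not survey data).

def projected_capacity(cagr: float, years: int, base: float = 1.0) -> float:
    """Compound a base capacity at the given annual growth rate."""
    return base * (1 + cagr) ** years

# CAGRs reported in the survey; horizon is a hypothetical planning window.
cagrs = {"academic": 0.18, "national": 0.43, "industrial": 0.78}
horizon = 5  # years

for sector, rate in cagrs.items():
    print(f"{sector:10s}: {projected_capacity(rate, horizon):5.1f}x")
```

Under these assumptions, academic capacity grows roughly 2.3x over five years while industrial capacity grows almost 18x, so the relative gap widens by nearly an order of magnitude even though all three sectors are expanding.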