LLMs as Packagers of HPC Software

📅 2025-11-07

🤖 AI Summary

HPC software ecosystems are highly heterogeneous and have complex dependency graphs. Package managers such as Spack automate dependency resolution, but their recipes (package.py files) must be written and maintained by hand, and that effort grows with the size of the ecosystem. This work introduces SpackIt, an LLM-driven framework for generating Spack recipes. It combines static analysis of the source repository to extract build metadata, retrieval of similar existing recipes as context, and a diagnostic feedback loop that iteratively refines the recipe, going end to end from source code to an installable recipe. Evaluated on 308 open-source HPC packages, SpackIt raises installation success from 20% in a zero-shot setting to over 80% in its best configuration. The results indicate that retrieval and structured feedback substantially lower the barrier to recipe development.

📝 Abstract

High performance computing (HPC) software ecosystems are inherently heterogeneous, comprising scientific applications that depend on hundreds of external packages, each with distinct build systems, options, and dependency constraints. Tools such as Spack automate dependency resolution and environment management, but their effectiveness relies on manually written build recipes. As these ecosystems grow, maintaining existing specifications and creating new ones becomes increasingly labor-intensive. While large language models (LLMs) have shown promise in code generation, automatically producing correct and maintainable Spack recipes remains a significant challenge. We present a systematic analysis of how LLMs and context-augmentation methods can assist in the generation of Spack recipes. To this end, we introduce SpackIt, an end-to-end framework that combines repository analysis, retrieval of relevant examples, and iterative refinement through diagnostic feedback. We apply SpackIt to a representative subset of 308 open-source HPC packages to assess its effectiveness and limitations. Our results show that SpackIt increases installation success from 20% in a zero-shot setting to over 80% in its best configuration, demonstrating the value of retrieval and structured feedback for reliable package synthesis.
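
The iterative refinement step described in the abstract can be pictured as a generate, install, and retry loop. The following is a minimal sketch under stated assumptions, not SpackIt's actual implementation: `refine_recipe`, `InstallResult`, and the two stub functions are hypothetical names, the LLM call and the installation attempt are passed in as plain callables, and the retry limit of three is arbitrary.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class InstallResult:
    ok: bool
    log: str  # installer output, reused as diagnostic feedback


def refine_recipe(
    generate: Callable[[str, Optional[str], Optional[str]], str],
    try_install: Callable[[str], InstallResult],
    context: str,
    max_attempts: int = 3,
) -> Tuple[Optional[str], int]:
    """Return (recipe, attempts_used); recipe is None if every attempt failed."""
    recipe: Optional[str] = None
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        # Each round sees the previous recipe and the failure log, if any.
        recipe = generate(context, recipe, feedback)
        result = try_install(recipe)
        if result.ok:
            return recipe, attempt
        feedback = result.log
    return None, max_attempts


# Hypothetical stand-ins for the LLM call and the install attempt.
def fake_generate(context, previous, feedback):
    if feedback and "zlib" in feedback:
        return previous + '\n    depends_on("zlib")'
    return "class Demo(CMakePackage):\n    pass"


def fake_install(recipe):
    if 'depends_on("zlib")' in recipe:
        return InstallResult(True, "")
    return InstallResult(False, "error: could not find zlib")


recipe, attempts = refine_recipe(fake_generate, fake_install, context="repo metadata")
```

With these stubs the first attempt fails on the missing dependency and the second succeeds once the log is fed back. In a real pipeline the two callables would wrap a model request and an actual installation run.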

Problem

Research questions and friction points this paper is trying to address.

Automating HPC software packaging to handle diverse dependencies and build systems

Reducing manual effort in creating and maintaining Spack build recipes

Improving LLM-generated Spack recipe correctness through structured feedback
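
One small piece of handling diverse build systems is working out which build system a repository uses before any recipe is drafted. The sketch below is an illustrative assumption about how such a check could look, not the paper's analysis step: `guess_base_class` and the marker table are hypothetical, while the class names on the right are Spack's standard package base classes.

```python
from pathlib import Path
from typing import Iterable, Optional

# Marker file -> Spack base class. Order encodes priority; the priorities
# are an assumption made for this example.
BUILD_SYSTEM_MARKERS = [
    ("CMakeLists.txt", "CMakePackage"),
    ("meson.build", "MesonPackage"),
    ("configure.ac", "AutotoolsPackage"),
    ("configure", "AutotoolsPackage"),
    ("pyproject.toml", "PythonPackage"),
    ("setup.py", "PythonPackage"),
    ("Makefile", "MakefilePackage"),
]


def guess_base_class(top_level_files: Iterable[str]) -> Optional[str]:
    """Pick a base class from the files found at the repository root."""
    names = {Path(f).name for f in top_level_files}
    for marker, base in BUILD_SYSTEM_MARKERS:
        if marker in names:
            return base
    return None  # unknown build system; leave the decision to later stages


guess = guess_base_class(["README.md", "CMakeLists.txt", "Makefile"])
```

A file-name check like this only narrows the search. Build options and dependency constraints still have to come from the build files themselves.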

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs generate HPC software build recipes

Context-augmentation retrieves relevant package examples

Iterative refinement with diagnostic feedback improves installation
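
Retrieving relevant package examples amounts to ranking existing recipes by similarity to the target repository and passing the top few to the model as context. The sketch below uses token-set Jaccard similarity purely as a simple stand-in. The source does not specify the similarity measure, and the function names and toy corpus are hypothetical.

```python
import re
from typing import Dict, List, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercase word-like tokens, keeping characters common in package names."""
    return set(re.findall(r"[a-z0-9_+\-]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_k_recipes(query: str, corpus: Dict[str, str], k: int = 2) -> List[Tuple[str, float]]:
    """Rank recipe descriptions by similarity to the query, best first."""
    q = tokens(query)
    scored = [(name, jaccard(q, tokens(text))) for name, text in corpus.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Toy descriptions standing in for existing recipes.
corpus = {
    "hdf5": "cmake mpi zlib parallel io library",
    "py-numpy": "python setuptools blas lapack",
    "netcdf-c": "cmake hdf5 mpi zlib io",
}
top = top_k_recipes("cmake project needing mpi and zlib for parallel io", corpus)
```

Here the two CMake-based I/O libraries rank ahead of the unrelated Python package. An embedding-based retriever would replace `jaccard` without changing the surrounding logic.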
