Demystifying the Evolution of Neural Networks with BOM Analysis: Insights from a Large-Scale Study of 55,997 GitHub Repositories

📅 2025-09-24
🤖 AI Summary
To address the lack of adaptation and evolution analysis tools for AI software, this paper proposes the Neural Network Bill of Materials (NNBOM) model—the first framework enabling empirical evolutionary studies across large-scale AI software ecosystems. Leveraging 55,997 open-source PyTorch projects, we construct an NNBOM database that integrates Software Bill of Materials (SBOM) principles with AI-specific artifacts, systematically characterizing long-term evolutionary patterns of pre-trained models and modular components in terms of scale growth, cross-domain dependencies, and reuse practices. Methodologically, we combine empirical software engineering with data mining techniques to uncover AI-specific evolutionary paradigms distinct from traditional software. Our contributions include: (1) the first scalable NNBOM data model and supporting empirical infrastructure; and (2) two prototype tools—a multi-repository collaborative evolution analysis platform and a single-repository component assessment and recommendation system—to aid developer decision-making.

📝 Abstract
Neural networks have become integral to many fields due to their exceptional performance. The open-source community has witnessed a rapid influx of neural network (NN) repositories with fast-paced iterations, making it crucial for practitioners to analyze their evolution to guide development and stay ahead of trends. While extensive research has explored traditional software evolution using Software Bills of Materials (SBOMs), these are ill-suited for NN software, which relies on pre-defined modules and pre-trained models (PTMs) with distinct component structures and reuse patterns. Conceptual AI Bills of Materials (AIBOMs) also lack practical implementations for large-scale evolutionary analysis. To fill this gap, we introduce the Neural Network Bill of Material (NNBOM), a comprehensive dataset construct tailored for NN software. We create a large-scale NNBOM database from 55,997 curated PyTorch GitHub repositories, cataloging their third-party libraries (TPLs), PTMs, and modules. Leveraging this database, we conduct a comprehensive empirical study of neural network software evolution across software scale, component reuse, and inter-domain dependency, providing maintainers and developers with a holistic view of its long-term trends. Building on these findings, we develop two prototype applications, Multi repository Evolution Analyzer and Single repository Component Assessor and Recommender, to demonstrate the practical value of our analysis.
Problem

Research questions and friction points this paper is trying to address.

Analyzing neural network software evolution using traditional SBOMs is inadequate
Existing AIBOM concepts lack practical implementations for large-scale evolutionary analysis
Understanding NN component reuse patterns across repositories remains challenging
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduced Neural Network Bill of Material (NNBOM) dataset construct
Created large-scale NNBOM database from 55,997 PyTorch repositories
Developed two prototype applications for evolution analysis
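
To make the idea of cataloging TPLs, PTMs, and modules more concrete, here is a minimal sketch of how a single PyTorch source file might be scanned for the three component kinds. It is not the paper's extraction pipeline: the `NNBOMEntry` record, its field names, and the heuristics (top-level import names as TPL candidates, string arguments to `from_pretrained` as PTM candidates, classes deriving from `nn.Module` as modules) are all assumptions made for illustration.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class NNBOMEntry:
    """Hypothetical record; field names are illustrative, not the paper's schema."""
    tpls: set = field(default_factory=set)     # top-level imported package names
    ptms: set = field(default_factory=set)     # string names passed to from_pretrained(...)
    modules: set = field(default_factory=set)  # classes deriving from nn.Module


def _dotted_name(node):
    """Return a dotted name such as 'nn.Module' for a Name/Attribute node, else None."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        prefix = _dotted_name(node.value)
        return f"{prefix}.{node.attr}" if prefix else None
    return None


def extract_entry(source: str) -> NNBOMEntry:
    """Scan one Python source string with simple, assumed heuristics."""
    entry = NNBOMEntry()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # Assumption: every absolute import is a candidate library;
            # standard-library names are not filtered out in this sketch.
            entry.tpls.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            entry.tpls.add(node.module.split(".")[0])
        elif isinstance(node, ast.ClassDef):
            bases = {_dotted_name(base) for base in node.bases}
            if bases & {"nn.Module", "torch.nn.Module", "Module"}:
                entry.modules.add(node.name)
        elif isinstance(node, ast.Call):
            func = node.func
            if (
                isinstance(func, ast.Attribute)
                and func.attr == "from_pretrained"
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)
            ):
                entry.ptms.add(node.args[0].value)
    return entry


SAMPLE = '''
import torch
import torch.nn as nn
from transformers import AutoModel

class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("bert-base-uncased")
        self.head = nn.Linear(768, 2)
'''

entry = extract_entry(SAMPLE)
print(sorted(entry.tpls), sorted(entry.ptms), sorted(entry.modules))
```

Parsing with `ast` keeps the sketch static, so no repository code is executed. A real pipeline would likely also need dependency manifests, weight-file detection, and version history, none of which are modeled here.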