Exploring the Lifecycle and Maintenance Practices of Pre-Trained Models in Open-Source Software Repositories

📅 2025-04-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This report plans an exploratory empirical study of the lifecycle practices of pre-trained models (PTMs) in open-source software (OSS), focusing on integration, evolution, testing, and maintenance challenges. The planned method combines large-scale mining of GitHub repositories with cross-platform PTM dependency tracing (Hugging Face, PyTorch Hub), analysis of historical commits and issue logs, and parsing of model metadata. The analysis targets recurring risks such as dependency staleness, inadequate documentation, and insufficient test coverage. The study aims to provide a software engineering analysis framework for model dependencies, together with a reusable PTM maintenance practice guide and a prototype detection tool, as support for the maintainability and engineering rigor of AI models within software systems.

📝 Abstract

Pre-trained models (PTMs) are becoming a common component in open-source software (OSS) development, yet their roles, maintenance practices, and lifecycle challenges remain underexplored. This report presents a plan for an exploratory study of how PTMs are used, maintained, and tested in OSS projects, focusing on models hosted on platforms such as Hugging Face and PyTorch Hub. We plan to mine software repositories that use PTMs and analyze their codebases, historical data, and reported issues. The study aims to provide actionable insights for improving the use and sustainability of PTMs in open-source projects, as a step toward a foundation for advancing software engineering practices in the context of model dependencies.
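
To make the repository-mining step more concrete, the following is a minimal sketch of how PTM usage might be detected in Python source text. It is not the authors' tooling. It assumes that PTMs are loaded through calls ending in `.from_pretrained(...)` (the Hugging Face convention) or through `torch.hub.load(...)`, and that the model name is passed as a string literal. The names `PTMReference` and `find_ptm_references` and the `repo#model` identifier format are hypothetical and chosen only for illustration.

```python
import re
from typing import List, NamedTuple


class PTMReference(NamedTuple):
    platform: str  # "huggingface" or "pytorch_hub" (labels assumed for this sketch)
    model_id: str  # identifier recovered from string-literal arguments
    line: int      # 1-based line number in the scanned source


# Assumed heuristics: only calls whose arguments are string literals are
# detected; names passed via variables or config files are missed.
_PATTERNS = {
    "huggingface": re.compile(r"""\.from_pretrained\(\s*["']([^"']+)["']"""),
    "pytorch_hub": re.compile(
        r"""torch\.hub\.load\(\s*["']([^"']+)["']\s*,\s*["']([^"']+)["']"""
    ),
}


def find_ptm_references(source: str) -> List[PTMReference]:
    """Return PTM-loading calls found in one file's source text."""
    refs = []
    for lineno, text in enumerate(source.splitlines(), start=1):
        for platform, pattern in _PATTERNS.items():
            for match in pattern.finditer(text):
                # Hub calls capture (repo, model); join them as "repo#model".
                model_id = "#".join(match.groups())
                refs.append(PTMReference(platform, model_id, lineno))
    return refs


if __name__ == "__main__":
    sample = (
        'model = AutoModel.from_pretrained("bert-base-uncased")\n'
        'net = torch.hub.load("pytorch/vision", "resnet18")\n'
    )
    for ref in find_ptm_references(sample):
        print(ref)
```

A regex scan like this is cheap enough to run across many repositories, but it is only a first filter: a real study would likely need AST-based analysis or manual validation to handle indirect model names.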

Problem

Research questions and friction points this paper is trying to address.

Investigating lifecycle challenges of pre-trained models in OSS

Analyzing maintenance practices of PTMs in open-source projects

Improving sustainability of model dependencies in software engineering

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mining software repositories for PTM usage

Analyzing codebase and historical PTM data

Investigating PTM maintenance in OSS projects
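
As an illustration of the historical analysis mentioned above, the sketch below turns a chronological series of per-commit PTM sets into "added" and "removed" events, which is one simple way model dependencies could be tracked over a project's history. The function name `ptm_change_events`, the input shape, and the event labels are assumptions made for this example and do not come from the paper.

```python
from typing import Iterable, List, Set, Tuple

Event = Tuple[str, str, str]  # (commit_id, "added" | "removed", model_id)


def ptm_change_events(history: Iterable[Tuple[str, Set[str]]]) -> List[Event]:
    """Compare consecutive snapshots of the PTMs a project references.

    `history` is assumed to be ordered oldest to newest; each item pairs a
    commit identifier with the set of model identifiers found at that commit.
    """
    events: List[Event] = []
    previous: Set[str] = set()
    for commit_id, models in history:
        current = set(models)
        # Models present now but not in the previous snapshot.
        for model in sorted(current - previous):
            events.append((commit_id, "added", model))
        # Models present previously but no longer referenced.
        for model in sorted(previous - current):
            events.append((commit_id, "removed", model))
        previous = current
    return events


if __name__ == "__main__":
    history = [
        ("c1", {"bert-base-uncased"}),
        ("c2", {"bert-base-uncased"}),
        ("c3", {"roberta-base"}),
    ]
    for event in ptm_change_events(history):
        print(event)
```

Events of this kind could then be joined with issue and commit messages to study why a model was swapped or dropped, though that linkage is outside this sketch.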
