Exploring the Lifecycle and Maintenance Practices of Pre-Trained Models in Open-Source Software Repositories

2025-04-08
AI Summary
This study empirically characterizes, for the first time, the full lifecycle practices of pre-trained models (PTMs) in open-source software (OSS), focusing on integration, evolution, testing, and maintenance challenges. We conduct large-scale mining of GitHub repositories, coupled with cross-platform PTM dependency tracing (Hugging Face, PyTorch Hub), historical commit and issue log analysis, and model metadata parsing. Our analysis systematically identifies recurring risks, including dependency staleness, inadequate documentation, and insufficient test coverage. We propose a novel software engineering analysis framework specifically designed for model dependencies, addressing critical gaps in PTM operationalization and sustainability research. As concrete outcomes, we deliver a reusable PTM maintenance practice guide and a prototype detection tool. These contributions provide both theoretical foundations and practical support for enhancing the maintainability and engineering rigor of AI models within software systems.

Abstract
Pre-trained models (PTMs) are becoming a common component in open-source software (OSS) development, yet their roles, maintenance practices, and lifecycle challenges remain underexplored. This report presents a plan for an exploratory study of how PTMs are used, maintained, and tested in OSS projects, focusing on models hosted on platforms such as Hugging Face and PyTorch Hub. We plan to mine software repositories that use PTMs and analyze their codebases, historical data, and reported issues. The study aims to provide actionable insights for improving the use and sustainability of PTMs in open-source projects, as a step toward a foundation for advancing software engineering practices around model dependencies.
Problem

Research questions and friction points this paper is trying to address.

Investigating lifecycle challenges of pre-trained models in OSS
Analyzing maintenance practices of PTMs in open-source projects
Improving sustainability of model dependencies in software engineering
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mining software repositories for PTM usage
Analyzing codebase and historical PTM data
Investigating PTM maintenance in OSS projects
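
As an illustration of what mining repositories for PTM usage could involve, the following is a minimal sketch, not the authors' tooling. It scans source text for two common loading idioms: a `from_pretrained("...")` call with a literal model ID (as used with Hugging Face libraries) and `torch.hub.load("<repo>", "<model>")`. The function name `find_ptm_references` and both regular expressions are assumptions made for this example; real projects load models in many other ways (variables, config files, local paths) that this sketch would miss.

```python
import re
from typing import List, Tuple

# Hypothetical patterns for illustration only: they match string-literal
# arguments and ignore model IDs passed through variables or configs.
HF_PATTERN = re.compile(r"""\.from_pretrained\(\s*["']([^"']+)["']""")
HUB_PATTERN = re.compile(
    r"""torch\.hub\.load\(\s*["']([^"']+)["']\s*,\s*["']([^"']+)["']"""
)


def find_ptm_references(source: str) -> List[Tuple[str, str]]:
    """Return (platform, model identifier) pairs found in a source string."""
    refs = []
    for match in HF_PATTERN.finditer(source):
        refs.append(("huggingface", match.group(1)))
    for match in HUB_PATTERN.finditer(source):
        # Join the hub repo and the entry-point name into one identifier.
        refs.append(("pytorch_hub", f"{match.group(1)}:{match.group(2)}"))
    return refs
```

Applied file by file across a repository's commit history, a detector along these lines could show when a model dependency was introduced or changed, which is the kind of historical signal the study proposes to analyze.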
Matin Koohjani
Department of Computer Science and Software Engineering, Concordia University
Diego Elias Costa
Assistant Professor, Concordia University
Software Engineering, Software Ecosystems, Performance Engineering, SE4AI