From Release to Adoption: Challenges in Reusing Pre-trained AI Models for Downstream Developers

📅 2025-06-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Downstream developers face significant practical challenges when reusing pre-trained models (PTMs) in software systems. Method: We conducted an empirical study analyzing 840 PTM-related issue reports from 31 open-source GitHub projects, applying systematic coding and thematic analysis to identify recurring challenges. Contribution/Results: We propose the first empirically grounded taxonomy of PTM reuse challenges, comprising seven categories: model usage, performance degradation, output quality, dependency management, documentation gaps, debugging difficulties, and migration/adaptation issues. Statistical testing reveals that PTM-related issues take significantly longer to resolve than general issues (p < 0.01), highlighting them as a critical bottleneck in PTM adoption. The taxonomy underwent rigorous expert validation and inter-coder reliability assessment, demonstrating high credibility and extensibility. It provides an evidence-based, structured foundation for developing tooling support, improving documentation practices, and guiding engineering workflows in PTM-integrated software development.

📝 Abstract

Pre-trained models (PTMs) have gained widespread popularity and achieved remarkable success across various fields, driven by their groundbreaking performance and easy accessibility through hosting providers. However, the challenges that downstream developers face when reusing PTMs in software systems remain underexplored. To bridge this knowledge gap, we built a dataset of 840 PTM-related issue reports from 31 OSS GitHub projects and analyzed it qualitatively. From this analysis, we systematically developed a comprehensive taxonomy of the PTM-related challenges that developers face in downstream projects. The taxonomy comprises seven key categories, including model usage, model performance, and output quality. We also compared our findings with existing taxonomies. Additionally, we conducted a resolution time analysis and, based on statistical tests, found that PTM-related issues take significantly longer to be resolved than issues unrelated to PTMs, with significant variation across challenge categories. We discuss the implications of our findings for practitioners and possibilities for future research.

Problem

Research questions and friction points this paper is trying to address.

Challenges downstream developers face when reusing pre-trained AI models

Identifying key categories of PTM-related issues in projects

Resolution time analysis of PTM-related versus non-PTM issues

Innovation

Methods, ideas, or system contributions that make the work stand out.

Qualitative analysis of 840 PTM issue reports

Developed taxonomy of PTM reuse challenges

Resolution time analysis for PTM-related issues
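
To make the resolution time comparison concrete, here is a minimal sketch of how two groups of issue resolution times could be compared. The abstract does not say which statistical tests were used, so the Mann-Whitney U test below is an assumption (a common choice for skewed duration data), and the resolution times are invented purely for illustration; they are not data from the study.

```python
import math
from itertools import groupby


def mann_whitney_u(xs, ys):
    """Two-sided Mann-Whitney U test using the normal approximation.

    Returns (u_x, p_value). u_x counts the (x, y) pairs where x > y,
    with tied pairs counting 0.5. The variance is corrected for ties;
    no continuity correction is applied.
    """
    n1, n2 = len(xs), len(ys)
    # Tag each value with its group (0 = xs, 1 = ys) and sort by value.
    combined = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])

    rank_sum_x = 0.0
    tie_term = 0.0
    pos = 0  # number of values already ranked
    for _, group in groupby(combined, key=lambda item: item[0]):
        group = list(group)
        t = len(group)
        # Tied values share the average of ranks pos+1 .. pos+t.
        avg_rank = pos + (t + 1) / 2
        rank_sum_x += avg_rank * sum(1 for _, g in group if g == 0)
        tie_term += t**3 - t
        pos += t

    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence of a difference
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u_x, p_value


# Hypothetical resolution times in days (illustrative values only).
ptm_days = [12, 30, 45, 7, 60, 21, 90, 14]
other_days = [3, 5, 10, 2, 8, 15, 4, 6]

u, p = mann_whitney_u(ptm_days, other_days)
print(f"U = {u}, p = {p:.4f}")
```

A rank-based test is used in this sketch because issue resolution times are typically heavy-tailed, which makes mean-based tests less reliable. For real analyses, an established statistics library would be preferable to this hand-rolled version.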
