🤖 AI Summary
Scientific open-source software (Sci-OSS) faces severe challenges to long-term sustainability. This study examines the problem empirically, analyzing ten representative Sci-OSS projects on GitHub along two core dimensions: community engagement and software quality. Our multimodal methodology integrates repository metadata, natural language processing of commit and issue texts, statistical modeling, and code quality metrics; we further propose a novel time-series visualization technique that unifies the dynamic evolution of heterogeneous indicators. Results reveal substantial heterogeneity in sustainability trajectories, even among projects within the same scientific domain. The work delivers a reusable assessment framework, empirically grounded insights, and actionable intervention tools for researchers, funding agencies, and developers. It constitutes the first systematic investigation into the multidimensional drivers and divergent evolutionary patterns underlying Sci-OSS sustainability.
📝 Abstract
Scientific open-source software (Sci-OSS) projects are critical for advancing research, yet sustaining them over the long term remains a major challenge. This paper explores the sustainability of Sci-OSS hosted on GitHub, focusing on two factors drawn from stewardship organizations: community engagement and software quality. We map sustainability to repository metrics drawn from the literature and mine data from ten prominent Sci-OSS projects. A multimodal analysis of these projects led us to a novel visualization technique that provides a robust way to display both current and evolving software metrics over time, replacing multiple traditional visualizations with one. Additionally, our statistical analysis shows that even projects in similar domains sustain themselves differently. Natural language analysis supports claims from the literature, highlighting that project-specific feedback plays a key role in maintaining software quality. Our visualization and analysis methods offer researchers, funders, and developers key insights into long-term software sustainability.