Ten simple rules for training scientists to make better software

📅 2024-02-07
🏛️ PLoS Comput. Biol.
📈 Citations: 1
Influential: 0
🤖 AI Summary
Doctoral students in the life sciences commonly lack formal software engineering training, which hinders the development of robust, reproducible, and collaborative research software. Method: The paper proposes ten rules for teaching research software development, treating it as distinct from generic programming instruction. The rules combine software engineering practices (e.g., version control, testing, documentation) with authentic research workflows, so that automation, documentation, testing, and collaboration are embedded throughout the research lifecycle. Contribution/Results: The guidelines target students with some programming literacy but little formal training in software development, typical of early doctoral students. The authors argue that these practices also apply outside doctoral training environments and should form a key part of postgraduate training in the life sciences more generally.

📝 Abstract
Computational methods and associated software implementations are central to every field of scientific investigation. Modern biological research, particularly within systems biology, has relied heavily on the development of software tools to process and organize increasingly large datasets, simulate complex mechanistic models, provide tools for the analysis and management of data, and visualize and organize outputs. However, developing high-quality research software requires scientists to develop a host of software development skills, and teaching these skills to students is challenging. There has been a growing importance placed on ensuring reproducibility and good development practices in computational research. However, less attention has been devoted to informing the specific teaching strategies which are effective at nurturing in researchers the complex skillset required to produce high-quality software that, increasingly, is required to underpin both academic and industrial biomedical research. Recent articles in the Ten Simple Rules collection have discussed the teaching of foundational computer science and coding techniques to biology students. We advance this discussion by describing the specific steps for effectively teaching the necessary skills scientists need to develop sustainable software packages which are fit for (re-)use in academic research or more widely. Although our advice is likely to be applicable to all students and researchers hoping to improve their software development skills, our guidelines are directed towards an audience of students that have some programming literacy but little formal training in software development or engineering, typical of early doctoral students. These practices are also applicable outside of doctoral training environments, and we believe they should form a key part of postgraduate training schemes more generally in the life sciences.
Problem

Research questions and friction points this paper is trying to address.

Teaching scientists to develop high-quality, sustainable software.
Addressing the lack of formal software development training in research.
Enhancing reproducibility and good practices in computational research.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Teaching sustainable software development skills.
Focusing on reproducibility and good practices.
Targeting students with basic programming knowledge.
K. Gallagher
Doctoral Training Centre, University of Oxford, UK
R. Creswell
Department of Computer Science, University of Oxford, UK
Ben Lambert
University of Oxford
Bayesian inference, epidemiology, mathematical biology
M. Robinson
Department of Computer Science, University of Oxford, UK
Chon Lok Lei
Faculty of Health Sciences, University of Macau, Macau, China
Gary R. Mirams
Centre for Mathematical Medicine & Biology, School of Mathematical Sciences, University of Nottingham, UK
D. Gavaghan
Doctoral Training Centre, University of Oxford, UK