Is Your Training Pipeline Production-Ready? A Case Study in the Healthcare Domain

📅 2025-06-07
📈 Citations: 0 (influential: 0)
🤖 AI Summary
Problem: Medical AI deployment is hindered by insufficient production readiness of machine learning (ML) training pipelines.
Method: This paper presents a progressive architectural evolution path (chaotic monolith → modular monolith → microservices), using SPIRA, a voice-based pre-diagnostic system for respiratory insufficiency, as a case study. It systematically introduces continuous training (CT) and a software-quality-attribute-driven MLOps governance framework tailored to healthcare, integrating modular design, microservice decomposition, and engineered CI/CD pipelines.
Contribution/Results: The approach improves pipeline maintainability, fault tolerance, and scalability, enabling stable, iterative evolution of SPIRA. It establishes an "agile ML + robust software engineering" co-design paradigm, offering a reusable methodology and a practical reference point for engineering medical AI in highly regulated environments.

📝 Abstract
Deploying a Machine Learning (ML) training pipeline into production requires robust software engineering practices, which differ significantly from experimental workflows. This experience report investigates this challenge in SPIRA, a project whose goal is to create an ML-Enabled System (MLES) to pre-diagnose respiratory insufficiency via speech analysis. The first version of SPIRA's training pipeline lacked critical software quality attributes. This paper presents an overview of the MLES, then compares three versions of the architecture of the Continuous Training subsystem, which evolved from a Big Ball of Mud, to a Modular Monolith, towards Microservices, adopting different design principles and patterns along the way to enhance its maintainability, robustness, and extensibility. In this way, the paper seeks to offer insights for both ML Engineers tasked with productionizing ML training pipelines and Data Scientists seeking to adopt MLOps practices.
Problem

Research questions and friction points this paper is trying to address.

Ensuring ML training pipelines are production-ready in healthcare
Improving software quality in MLES for respiratory pre-diagnosis
Evolving architecture for better maintainability and robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evolved architecture from Big Ball of Mud to Microservices
Enhanced maintainability, robustness, and extensibility
Adopted MLOps practices for production-ready pipelines
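
This summary does not show SPIRA's actual code, so the following is only a minimal sketch of the general idea behind moving from a Big Ball of Mud to a Modular Monolith: each training step sits behind one shared interface and a small orchestrator runs the steps in order. All names here (`Stage`, `LoadAudio`, `ExtractFeatures`, `TrainModel`, `Pipeline`) and the toy data are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Protocol


class Stage(Protocol):
    """Shared interface for every pipeline stage (hypothetical, not SPIRA's API)."""

    name: str

    def run(self, context: Dict[str, Any]) -> Dict[str, Any]: ...


@dataclass
class LoadAudio:
    name: str = "load_audio"

    def run(self, context: Dict[str, Any]) -> Dict[str, Any]:
        # Placeholder values standing in for loaded voice recordings.
        return {**context, "samples": [0.1, 0.4, 0.35, 0.8]}


@dataclass
class ExtractFeatures:
    name: str = "extract_features"

    def run(self, context: Dict[str, Any]) -> Dict[str, Any]:
        samples = context["samples"]
        # Toy feature: the mean of the samples.
        return {**context, "mean": sum(samples) / len(samples)}


@dataclass
class TrainModel:
    name: str = "train_model"

    def run(self, context: Dict[str, Any]) -> Dict[str, Any]:
        # Toy "model": a decision threshold set at the mean feature value.
        return {**context, "model": {"threshold": context["mean"]}}


@dataclass
class Pipeline:
    stages: List[Stage]

    def run(self, context: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        context = dict(context or {})
        for stage in self.stages:
            try:
                context = stage.run(context)
            except Exception as exc:
                # Report which module failed instead of failing opaquely.
                raise RuntimeError(f"stage '{stage.name}' failed") from exc
        return context


if __name__ == "__main__":
    result = Pipeline([LoadAudio(), ExtractFeatures(), TrainModel()]).run()
    print(result["model"])
```

Because each stage depends only on the shared context dictionary and the `Stage` interface, a stage can be tested, replaced, or later moved behind a service boundary without touching the others, which is the kind of maintainability and extensibility gain the list above refers to.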
Authors
Daniel Lawand
Instituto de Matemática e Estatística (IME), University of São Paulo (USP), Brazil
Lucas Quaresma
Instituto de Matemática e Estatística (IME), University of São Paulo (USP), Brazil
Roberto Bolgheroni
Instituto de Matemática e Estatística (IME), University of São Paulo (USP), Brazil
Alfredo Goldman
Associate Professor of Computer Science, University of São Paulo
HPC, Distributed Systems, Agile Methods, Technical Debt
Renato Cordeiro Ferreira
Instituto de Matemática e Estatística (IME), University of São Paulo (USP), Brazil; Jheronimus Academy of Data Science (JADS), Tilburg University (TiU) and Technical University of Eindhoven (TUe), The Netherlands