Scalability and Maintainability Challenges and Solutions in Machine Learning: Systematic Literature Review

📅 2025-04-15

🤖 AI Summary

This study addresses the inherent tension between scalability and maintainability in machine learning (ML) systems, a critical challenge impeding robust, production-grade deployment. Method: The authors conduct a systematic literature review (SLR) of 124 high-quality publications and develop the first six-dimensional analytical framework, covering maintainability and scalability in each of data engineering, model engineering, and system deployment. Contribution/Results: The work identifies 41 categories of maintainability issues and 13 categories of scalability issues, and uncovers their cross-stage trade-offs and synergies. It introduces the first taxonomy of scalability–maintainability challenges in ML systems, accompanied by a problem distribution map and an evidence-based repository quantifying solution effectiveness. Together, these findings provide empirically grounded, cross-stage design principles and actionable optimization pathways for industrial ML system development.

📝 Abstract

This systematic literature review examines the critical challenges and solutions related to scalability and maintainability in Machine Learning (ML) systems. As ML applications become increasingly complex and widespread across industries, the need to balance system scalability with long-term maintainability has emerged as a significant concern. This review synthesizes current research and practices addressing these dual challenges across the entire ML life-cycle, from data engineering to model deployment in production. We analyzed 124 papers to identify and categorize 41 maintainability challenges and 13 scalability challenges, along with their corresponding solutions. Our findings reveal intricate interdependencies between scalability and maintainability, where improvements in one often affect the other. The review is structured around six primary research questions, examining maintainability and scalability challenges in data engineering, model engineering, and ML system development. We explore how these challenges manifest differently across the stages of the ML life-cycle. This overview offers insights for both researchers and practitioners in the field of ML systems. It aims to guide future research directions, inform best practices, and contribute to the development of more robust, efficient, and sustainable ML applications across domains.

Problem

Research questions and friction points this paper is trying to address.

Identifies scalability and maintainability challenges in ML systems

Analyzes solutions across ML lifecycle from data to deployment

Explores interdependencies between scalability and maintainability in ML

Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic review of ML scalability and maintainability

Analyzes 124 papers for challenges and solutions

Focuses on ML life-cycle from data to deployment

🔎 Similar Papers

A Multivocal Review of MLOps Practices, Challenges and Open Issues