Real-world Machine Learning Systems: A survey from a Data-Oriented Architecture Perspective

📅 2023-02-09

🏛️ arXiv.org

📈 Citations: 7

✨ Influential: 2

🤖 AI Summary

Real-world deployment of AI systems faces two challenges: floods of heterogeneous data and stringent low-latency requirements, which together strain conventional software architectures. To address this, we conduct the first systematic analysis of 217 production-deployed ML systems from a Data-Oriented Architecture (DOA) perspective, uncovering implicit DOA design patterns, particularly in loose coupling, decentralization, and data-driven orchestration. Using a systematic literature review, architectural pattern extraction, and requirement-to-design mapping, we synthesize the first empirically grounded DOA practice guide and a taxonomy of open challenges for real-world ML deployments. We also propose actionable, reusable recommendations for ML system deployment and introduce a novel architecture evaluation framework. Together, these contributions bridge the knowledge gap between DOA theory and industrial engineering practice.

📝 Abstract

Machine Learning models are being deployed as parts of real-world systems with the upsurge of interest in artificial intelligence. The design, implementation, and maintenance of such systems are challenged by real-world environments that produce larger amounts of heterogeneous data and by users who require increasingly faster responses with efficient resource consumption. These requirements push prevalent software architectures to the limit when deploying ML-based systems. Data-oriented Architecture (DOA) is an emerging concept that better equips systems for integrating ML models. DOA extends current architectures to create data-driven, loosely coupled, decentralised, open systems. Even though papers on deployed ML-based systems do not mention DOA, their authors made design decisions that implicitly follow DOA. Why, how, and to what extent DOA is adopted in these systems is unclear. Because these design decisions are implicit, practitioners have limited knowledge of how to apply DOA when designing ML-based systems in the real world. This paper answers these questions by surveying real-world deployments of ML-based systems. The survey shows the design decisions of the systems and the requirements these satisfy. Based on the survey findings, we also formulate practical advice to facilitate the deployment of ML-based systems. Finally, we outline open challenges to deploying DOA-based systems that integrate ML models.
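
As a rough illustration of two of the DOA properties named in the abstract (data-driven and loosely coupled), the sketch below shows components that never call each other directly and interact only through records placed on a shared data medium. It is not taken from the paper: the `DataBus` class, the topic names, and the threshold-based "model" are all hypothetical, and the in-memory bus does not attempt to show decentralisation or openness.

```python
from collections import defaultdict
from typing import Any, Callable


class DataBus:
    """Hypothetical in-memory shared data medium (an assumption for illustration).

    Components publish records to topics and react to records on topics;
    they hold no references to one another.
    """

    def __init__(self) -> None:
        self.log: dict[str, list[Any]] = defaultdict(list)
        self.subscribers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self.subscribers[topic].append(handler)

    def publish(self, topic: str, record: Any) -> None:
        # Keep every record so that new components can later read the history.
        self.log[topic].append(record)
        for handler in self.subscribers[topic]:
            handler(record)


def make_scorer(bus: DataBus, threshold: float) -> Callable[[dict], None]:
    """Stand-in for an ML model: flags readings whose value exceeds a threshold."""

    def on_reading(reading: dict) -> None:
        bus.publish(
            "predictions",
            {"id": reading["id"], "anomaly": reading["value"] > threshold},
        )

    return on_reading


bus = DataBus()
bus.subscribe("readings", make_scorer(bus, threshold=10.0))

# A downstream consumer that only knows about the "predictions" topic.
alerts: list[str] = []
bus.subscribe(
    "predictions",
    lambda p: alerts.append(p["id"]) if p["anomaly"] else None,
)

# The arrival of data, not a direct call between components, drives the work.
bus.publish("readings", {"id": "a", "value": 3.0})
bus.publish("readings", {"id": "b", "value": 42.0})
```

In this sketch the scorer could be swapped for a different model, or a second consumer added, without changing any other component, which is the kind of loose coupling the abstract attributes to DOA.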

Problem

Research questions and friction points this paper is trying to address.

Surveying DOA adoption for ML system deployment challenges

Addressing knowledge gap in implicit DOA design decisions

Providing practical advice for ML-based system requirements

Innovation

Methods, ideas, or system contributions that make the work stand out.

Data-oriented Architecture for ML integration

Systematic survey on DOA adoption practices

Big Data management with low latency

🔎 Similar Papers

A Large-Scale Study of Model Integration in ML-Enabled Software Systems