🤖 AI Summary
This work addresses the prevalent misuse of machine learning (ML) cloud services in software systems, which often degrades system quality and maintainability and for which effective detection mechanisms have been lacking. To tackle this issue, the authors propose MLmisFinder, the first approach to systematically define and automatically identify seven representative categories of ML service misuse. The method integrates meta-modeling with rule-driven static analysis and introduces specialized detection algorithms tailored to scenarios such as data drift monitoring and schema validation. Evaluated on 107 open-source projects, MLmisFinder achieves an average precision of 96.7% and recall of 97%, substantially outperforming baseline techniques. Furthermore, its successful application to 817 additional systems demonstrates the widespread nature of ML service misuse in real-world software.
📝 Abstract
Machine Learning (ML) cloud services, offered by leading providers such as Amazon, Google, and Microsoft, enable the integration of ML components into software systems without building models from scratch. However, the rapid adoption of ML services, coupled with the growing complexity of business requirements, has led to widespread misuses, compromising the quality, maintainability, and evolution of ML service-based systems. Though prior research has studied patterns and antipatterns in service-based and ML-based systems separately, automatic detection of ML service misuses remains a challenge. In this paper, we propose MLmisFinder, an automatic approach to detect ML service misuses in software systems, aiming to identify instances of improper use of ML services to help developers properly integrate ML components in ML service-based systems. We propose a metamodel that captures the data needed to detect misuses in ML service-based systems and apply a set of rule-based detection algorithms for seven misuse types. We evaluated MLmisFinder on 107 software systems collected from open-source GitHub repositories and compared it with a state-of-the-art baseline. Our results show that MLmisFinder effectively detects ML service misuses, achieving an average precision of 96.7% and recall of 97%, outperforming the state-of-the-art baseline. MLmisFinder also scaled efficiently to detect misuses across 817 ML service-based systems and revealed that such misuses are widespread, especially in areas such as data drift monitoring and schema validation.
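To give a flavor of what rule-based static detection of an ML service misuse can look like, here is a minimal, hypothetical sketch. The abstract does not disclose MLmisFinder's actual rules or the client APIs it targets, so the rule below (flagging functions that call an ML service's `predict` endpoint without first invoking a schema-validation helper), along with the names `predict` and `validate_schema`, are illustrative assumptions only, implemented over Python's standard `ast` module:

```python
import ast

# Hypothetical misuse rule in the spirit of rule-based static analysis:
# a function that sends data to an ML service's predict endpoint without
# any schema-validation call is flagged as a potential misuse.
ML_PREDICT_CALLS = {"predict"}           # assumed ML-service invocation names
SCHEMA_VALIDATORS = {"validate_schema"}  # assumed validation helper names

def find_missing_schema_validation(source: str) -> list[str]:
    """Return names of functions that call predict() without validating input."""
    tree = ast.parse(source)
    flagged = []
    for func in (n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)):
        # Collect the bare names of all calls inside this function:
        # attribute calls like client.predict(...) and plain calls
        # like validate_schema(...).
        calls = {
            n.func.attr if isinstance(n.func, ast.Attribute)
            else getattr(n.func, "id", "")
            for n in ast.walk(func) if isinstance(n, ast.Call)
        }
        if (ML_PREDICT_CALLS & calls) and not (SCHEMA_VALIDATORS & calls):
            flagged.append(func.name)
    return flagged

sample = """
def good(client, data):
    validate_schema(data)
    return client.predict(data)

def bad(client, data):
    return client.predict(data)
"""
print(find_missing_schema_validation(sample))  # ['bad']
```

A real detector built on a metamodel would of course work over richer program facts (data flow, service configuration, call order) rather than a flat set of call names, but the shape is the same: extract facts from the code, then match each misuse type's rule against them.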