🤖 AI Summary
This study addresses the limited empirical understanding of how green architectural tactics are adopted in machine learning systems, and the open question of whether sustainability tactics exist in practice that the literature has not yet documented. It presents the first systematic investigation that combines large language models with software repository analysis to mine green practices across 205 open-source ML projects, resulting in a novel taxonomy of green AI architectural tactics. The work confirms that tactics already cataloged in the literature are adopted in real projects, at varying rates, and identifies nine previously undocumented sustainability tactics. To enhance practical utility, the study provides reproducible code examples for each tactic, giving developers actionable guidance for building environmentally sustainable ML systems.
📝 Abstract
Context: The increasing adoption of machine learning (ML) and artificial intelligence (AI) technologies raises growing concerns about their environmental sustainability. Developing and deploying ML-enabled systems is computationally intensive, particularly during training and inference. Green AI has emerged to address these issues by promoting efficiency without sacrificing accuracy. While prior research has proposed catalogs of sustainable practices (i.e., green tactics), there remains limited understanding of how widely they are adopted in practice and whether additional, undocumented tactics exist. Objective: This study investigates the extent to which existing sustainable practices are implemented in real-world ML-enabled systems and identifies previously undocumented practices that support environmental sustainability. Method: We conduct a mining software repositories study of 205 open-source ML projects on GitHub. To support our analysis, we design a novel mechanism based on large language models (LLMs) that identifies both known and new sustainable practices in code repositories. Results: Our findings confirm that green tactics reported in the literature are used in practice, although adoption rates vary. Furthermore, our LLM-based approach reveals nine previously undocumented sustainable practices. Each tactic is accompanied by code examples to aid adoption and integration. Conclusions: We provide insights for practitioners seeking to reduce the environmental impact of ML-enabled systems and offer a foundation for future research on automating the detection and adoption of sustainable practices.