AI Summary
Problem: Current AI sustainability evaluations lack standardized, model-agnostic protocols for long-term assessment; they remain largely confined to short-term batch-learning settings and fail to capture the resource-performance trade-offs inherent in real-world system lifecycles.
Method: We propose the first general long-term sustainability evaluation framework applicable across learning paradigms, both batch and streaming. It combines dynamic data-evolution simulation, tracking across multiple model-update rounds, fine-grained resource monitoring, and cross-model comparative experiments, and is validated on classification tasks (a minimal illustrative sketch follows this summary).
Contribution/Results: (1) A model-agnostic long-term evaluation protocol; (2) Empirical evidence that higher environmental cost does not necessarily yield substantial performance gains; (3) Significant inter-model variation in long-term sustainability, demonstrating that conventional static evaluation risks misleading deployment decisions; (4) A reproducible, comparable benchmark for green AI assessment.
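To make the protocol concrete, here is a minimal Python sketch of the batch-setting evaluation loop: repeated update rounds over evolving data, with a per-round performance and resource record. This is an illustration under assumptions, not the authors' implementation: `make_drifting_rounds` is a hypothetical stand-in for the paper's data-evolution simulator, and wall-clock update time stands in for the fine-grained resource monitoring.

```python
# Hedged sketch: multi-round evaluation over evolving data (batch setting).
import time

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier


def make_drifting_rounds(n_rounds, seed=0):
    """Yield (X_train, y_train, X_test, y_test) splits that drift each round.

    Hypothetical stand-in for the paper's data-evolution simulator.
    """
    rng = np.random.default_rng(seed)
    for r in range(n_rounds):
        X, y = make_classification(
            n_samples=600, n_features=20,
            shift=0.1 * r,                      # crude simulated drift
            random_state=int(rng.integers(1_000_000)),
        )
        yield X[:400], y[:400], X[400:], y[400:]


model = SGDClassifier(random_state=0)
history = []
for r, (Xtr, ytr, Xte, yte) in enumerate(make_drifting_rounds(n_rounds=10)):
    t0 = time.perf_counter()
    model.partial_fit(Xtr, ytr, classes=np.array([0, 1]))  # one update round
    cost = time.perf_counter() - t0                        # resource proxy
    acc = float((model.predict(Xte) == yte).mean())
    history.append({"round": r, "accuracy": acc, "seconds": cost})
print(history[-1])
```

Per-round records like these are what make resource-performance trade-offs visible over the lifecycle; a dedicated meter (e.g., CodeCarbon) could replace the timer to report energy or CO2 directly.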
Abstract
Sustainability and efficiency have become essential considerations in the development and deployment of Artificial Intelligence systems, yet existing regulatory and reporting practices lack standardized, model-agnostic evaluation protocols. Current assessments often measure only short-term experimental resource usage and disproportionately emphasize batch learning settings, failing to reflect real-world, long-term AI lifecycles. In this work, we propose a comprehensive evaluation protocol for assessing the long-term sustainability of machine learning (ML) models, applicable to both batch and streaming learning scenarios. Through experiments on diverse classification tasks using a range of model types, we demonstrate that traditional static train-test evaluations do not reliably capture sustainability under evolving data and repeated model updates. Our results show that long-term sustainability varies significantly across models, and in many cases, higher environmental cost yields little performance benefit.
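For the streaming side of the protocol, the analogous readout comes from a test-then-train (prequential) loop, in which each incoming item is first scored and then learned from. The sketch below is likewise a hedged illustration: the two scikit-learn models and the wall-clock cost proxy are stand-in choices, used only to show how accuracy and cumulative update cost can be compared across models on the same stream.

```python
# Hedged sketch: test-then-train (prequential) evaluation (streaming setting),
# with a per-model cost/performance readout. Models and cost proxy are
# illustrative stand-ins, not those used in the paper.
import time

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
classes = np.unique(y)

for name, model in [("SGD", SGDClassifier(random_state=0)),
                    ("GaussianNB", GaussianNB())]:
    correct, cost, seen = 0, 0.0, False
    for xi, yi in zip(X, y):
        xi = xi.reshape(1, -1)
        if seen:                                      # test on the new item,
            correct += int(model.predict(xi)[0] == yi)
        t0 = time.perf_counter()
        model.partial_fit(xi, [yi], classes=classes)  # then train on it
        cost += time.perf_counter() - t0              # cumulative update cost
        seen = True
    print(f"{name}: prequential accuracy={correct / (len(y) - 1):.3f}, "
          f"update cost={cost:.2f}s")
```

Running such a loop side by side for several candidate models is one concrete way to surface the inter-model variation the abstract reports, including cases where extra resource cost buys little additional accuracy.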