🤖 AI Summary
This paper addresses the high energy consumption of discriminative AI (e.g., CNNs) and generative AI, particularly large language models (LLMs), across the full MLOps lifecycle (training, inference, serving). We propose a hardware-software co-designed green AI operations methodology. Our approach integrates software-level power measurement (RAPL/NVML), multi-architecture evaluation (CNN/Transformer/LLM), hyperparameter sensitivity analysis, and hardware-workload-energy modeling. Key contributions include: (i) the first systematic comparative analysis of energy consumption patterns between discriminative and generative models; (ii) the finding that large models do not necessarily consume more energy under low utilization; and (iii) a reproducible, cross-architecture power measurement benchmark. Experiments demonstrate that joint architecture-hyperparameter-hardware optimization significantly reduces energy consumption without sacrificing accuracy. We quantitatively characterize the nonlinear relationships among LLM energy use, model scale, inference complexity, and request throughput. Finally, we provide practical, carbon-efficiency-oriented AI operations guidelines.
📝 Abstract
This study presents an empirical investigation into the energy consumption of Discriminative and Generative AI models within real-world MLOps pipelines. For Discriminative models, we examine various architectures and hyperparameters during training and inference and identify energy-efficient practices. For Generative AI, we assess Large Language Models (LLMs), focusing on energy consumption across different model sizes and varying service-request loads. Our study employs software-based power measurements, ensuring ease of replication across diverse configurations, models, and datasets. We analyse multiple models and hardware setups to uncover correlations among various metrics, identifying key contributors to energy consumption. The results indicate that for Discriminative models, optimising architectures, hyperparameters, and hardware can significantly reduce energy consumption without sacrificing performance. For LLMs, energy efficiency depends on balancing model size, reasoning complexity, and request-handling capacity, as larger models do not necessarily consume more energy when utilisation remains low. This analysis provides practical guidelines for designing green and sustainable ML operations, emphasising energy consumption and carbon footprint reductions while maintaining performance. This paper can also serve as a benchmark for accurately estimating the total energy use of different types of AI models.
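The software-based power measurement mentioned above (RAPL for CPUs, NVML for GPUs) can be sketched as a before/after read of a cumulative energy counter around the workload. The snippet below is a minimal illustration using the Linux powercap sysfs interface for Intel RAPL; the sysfs path and domain layout are assumptions for illustration, and the paper's actual tooling is not shown here. Counter wrap-around and multi-socket aggregation are deliberately ignored in this sketch.

```python
from pathlib import Path

# Illustrative default: RAPL package-0 domain on a typical Linux host.
# The exact path varies by platform and requires read permission.
DEFAULT_DOMAIN = Path("/sys/class/powercap/intel-rapl/intel-rapl:0")

def read_energy_uj(domain: Path) -> int:
    """Read the cumulative energy counter (microjoules) of one RAPL domain."""
    return int((domain / "energy_uj").read_text())

def measure_energy_j(fn, domain: Path = DEFAULT_DOMAIN) -> float:
    """Energy (joules) consumed while fn() runs, as a counter difference.

    Sketch only: a real harness would handle counter wrap-around,
    sum over all packages/DRAM domains, and subtract idle power.
    """
    start = read_energy_uj(domain)
    fn()  # e.g., one training epoch or a batch of inference requests
    end = read_energy_uj(domain)
    return (end - start) / 1e6
```

A GPU-side measurement would follow the same pattern with NVML's power/energy queries instead of the sysfs counter; per-request energy for LLM serving is then the measured difference divided by the number of requests served.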