🤖 AI Summary
Data centers' high energy consumption adds to global carbon emissions, and existing carbon-aware scheduling for Kubernetes relies on centralized machine learning models that raise privacy concerns and generalize poorly across environments. To address these challenges, this paper proposes a federated learning framework for energy consumption prediction in container orchestration. The framework extends the Kepler monitoring system with Flower and XGBoost, and aggregates client models with Flower's FedXgbBagging strategy, a bagging-based aggregation method. This enables privacy-preserving, collaborative training across organizational boundaries without sharing raw operational data. Evaluated on the SPECPower dataset, the framework achieves an 11.7% lower mean absolute error than a centralized baseline, which supports the feasibility of carbon-aware scheduling on Kubernetes without centralized data collection.
📝 Abstract
The growing reliance on large-scale data centers to run resource-intensive workloads has significantly increased the global carbon footprint, underscoring the need for sustainable computing solutions. While container orchestration platforms like Kubernetes help optimize workload scheduling to reduce carbon emissions, existing methods often depend on centralized machine learning models that raise privacy concerns and struggle to generalize across diverse environments. In this paper, we propose a federated learning (FL) approach for energy consumption prediction that preserves data privacy by keeping sensitive operational data within individual enterprises. By extending the Kubernetes Efficient Power Level Exporter (Kepler), our framework trains XGBoost models collaboratively across distributed clients and combines them with Flower's FedXgbBagging strategy, a bagging-based aggregation method, eliminating the need for centralized data sharing. Experimental results on the SPECPower benchmark dataset show that our FL-based approach achieves an 11.7% lower Mean Absolute Error (MAE) than a centralized baseline. This work addresses the unresolved trade-off between data privacy and energy prediction accuracy in prior systems such as Kepler and CASPER, and offers enterprises a viable pathway toward sustainable cloud computing without compromising operational privacy.
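The bagging-style aggregation described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the paper's implementation and does not use Flower or XGBoost: each hypothetical client fits a single regression stump on its own data, and the server only pools the resulting trees and averages their predictions, so raw measurements never leave the client. The function names, the utilization-to-power samples, and the use of one stump per client are all assumptions made for illustration.

```python
from statistics import mean

def fit_stump(xs, ys):
    """Fit a one-split regression stump by exhaustive threshold search.

    Assumes at least two distinct x values, so both sides of a split are non-empty.
    """
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the largest x so the right side is never empty
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        l, r = mean(left), mean(right)
        sse = sum((y - l) ** 2 for y in left) + sum((y - r) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, l, r)
    _, t, l, r = best
    return (t, l, r)

def predict_stump(stump, x):
    t, l, r = stump
    return l if x <= t else r

def aggregate(client_models):
    """Server side: pool the trees sent by all clients into one ensemble."""
    return [tree for model in client_models for tree in model]

def predict(ensemble, x):
    """Bagging-style prediction: average the outputs of all pooled trees."""
    return mean(predict_stump(s, x) for s in ensemble)

def mae(y_true, y_pred):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(a - b) for a, b in zip(y_true, y_pred))

# Hypothetical local data: (CPU utilization in %, power draw in W) per client.
clients = {
    "client_a": ([10.0, 30.0, 50.0, 70.0], [60.0, 80.0, 100.0, 120.0]),
    "client_b": ([20.0, 40.0, 60.0, 80.0], [90.0, 110.0, 130.0, 150.0]),
}

# Each client trains locally and shares only its model (a list of trees).
local_models = [[fit_stump(xs, ys)] for xs, ys in clients.values()]
global_model = aggregate(local_models)

# Evaluate the pooled model on made-up held-out points.
test_x, test_y = [25.0, 65.0], [85.0, 125.0]
test_mae = mae(test_y, [predict(global_model, x) for x in test_x])
```

In this sketch only the `(threshold, left_value, right_value)` tuples cross the client boundary, which mirrors the privacy argument in the abstract. A real deployment would replace the stumps with boosted XGBoost trees and the `aggregate` function with the federated strategy provided by the framework.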