🤖 AI Summary
This work addresses the challenge of incomparable carbon footprints in federated learning, which stems from inconsistent measurement boundaries and the absence of standardized reporting. To this end, the authors propose a standardized, phase-aware carbon accounting framework that integrates NVIDIA NVFlare with CodeCarbon to explicitly track CO₂e emissions across distinct phases (initialization, per-round training, evaluation, and idle/coordination) and incorporates a configurable network energy model to estimate communication overhead. Experiments on CIFAR-10 and retinal segmentation tasks show that system inefficiencies can increase the carbon footprint by 8.34× to 21.73× relative to a high-efficiency baseline, while swapping GPU tiers changes energy consumption non-uniformly across sites, underscoring the need for fine-grained, full-lifecycle carbon assessment. The framework supports reproducible research toward greener federated learning.
📝 Abstract
Federated learning (FL) enables collaborative model training over privacy-sensitive, distributed data, but its environmental impact is difficult to compare across studies due to inconsistent measurement boundaries and heterogeneous reporting. We present a practical carbon-accounting methodology for FL that tracks CO2e with NVIDIA NVFlare and CodeCarbon across explicit phases (initialization, per-round training, evaluation, and idle/coordination). To capture non-compute effects, we additionally estimate communication emissions from transmitted model-update sizes under a configurable network energy model. We validate the proposed approach on two representative workloads: CIFAR-10 image classification and retinal optic disk segmentation. In CIFAR-10, controlled client-efficiency scenarios show that system-level slowdowns and coordination effects can contribute meaningfully to the carbon footprint under an otherwise fixed FL protocol, increasing total CO2e by 8.34x (medium efficiency) and 21.73x (low efficiency) relative to the high-efficiency baseline. In retinal segmentation, swapping GPU tiers (H100 vs. V100) yields a consistent 1.7x runtime gap (290 vs. 503 minutes) while producing non-uniform changes in total energy and CO2e across sites, underscoring the need for per-site and per-round reporting. Overall, our results support a standardized carbon accounting method as a prerequisite for reproducible 'green' FL evaluation. Our code is available at https://github.com/Pediatric-Accelerated-Intelligence-Lab/carbon_footprint.
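To make the communication estimate concrete, the following is a minimal sketch of how emissions might be derived from model-update sizes and then combined with per-phase compute emissions. It is not the authors' implementation: the class `NetworkEnergyModel`, the functions `communication_co2e_kg` and `total_co2e_kg`, the assumption that each round moves one download and one upload per client, and every numeric value are hypothetical placeholders chosen for illustration, not figures from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkEnergyModel:
    """Hypothetical configurable network model (all values are assumptions)."""
    kwh_per_gb: float             # energy used to move 1 GB over the network
    grid_kg_co2e_per_kwh: float   # carbon intensity of the electricity grid


def communication_co2e_kg(update_bytes: float, num_clients: int,
                          num_rounds: int, model: NetworkEnergyModel) -> float:
    """Estimate communication CO2e (kg) from transmitted model-update sizes.

    Assumes that in each round every client downloads the global model and
    uploads an update of the same size, hence the factor of 2.
    """
    total_bytes = 2 * update_bytes * num_clients * num_rounds
    total_gb = total_bytes / 1e9
    energy_kwh = total_gb * model.kwh_per_gb
    return energy_kwh * model.grid_kg_co2e_per_kwh


def total_co2e_kg(phase_co2e_kg: dict, comm_co2e_kg: float) -> float:
    """Sum per-phase compute emissions and add the communication estimate."""
    return sum(phase_co2e_kg.values()) + comm_co2e_kg


# Placeholder inputs for illustration only.
network = NetworkEnergyModel(kwh_per_gb=0.05, grid_kg_co2e_per_kwh=0.4)
comm = communication_co2e_kg(update_bytes=50e6, num_clients=4,
                             num_rounds=10, model=network)

# Per-phase compute emissions, as a phase-aware tracker might report them.
phases = {
    "initialization": 0.01,
    "training": 0.50,
    "evaluation": 0.05,
    "idle_coordination": 0.10,
}
total = total_co2e_kg(phases, comm)
print(f"communication: {comm:.3f} kg CO2e, total: {total:.3f} kg CO2e")
```

Keeping the phase totals in a dictionary keyed by phase name means idle/coordination time appears as its own line item rather than being folded into training, which is the kind of breakdown the per-site and per-round reporting argument relies on.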