🤖 AI Summary
This work proposes a curtailment-aware elastic distributed training framework for large language models (LLMs) that aligns full-parameter pretraining with temporal windows of renewable energy curtailment—periods when surplus clean electricity is otherwise wasted due to grid constraints. By dynamically switching between single-site training and federated synchronization across geographically distributed GPU clusters, the method puts that otherwise curtailed renewable energy to productive use. Implemented on the Flower framework, the system trains a 561M-parameter Transformer model across three distributed clusters, scheduling computation using real-world marginal carbon intensity data. Experimental results demonstrate that the approach maintains model convergence and performance while reducing operational carbon emissions to just 5–12% of those incurred by a single-site baseline, achieving both low-carbon and cost-efficient LLM pretraining.
📝 Abstract
Training large language models (LLMs) requires substantial compute and energy. At the same time, renewable energy sources regularly produce more electricity than the grid can absorb, leading to curtailment: the deliberate reduction of clean generation whose output would otherwise go to waste. These periods represent an opportunity: if training is aligned with curtailment windows, LLMs can be pretrained using electricity that is both clean and cheap. This technical report presents a system that performs full-parameter LLM training across geo-distributed GPU clusters during regional curtailment windows, elastically switching between local single-site training and federated multi-site synchronization as sites become available or unavailable. Our prototype trains a 561M-parameter Transformer model across three clusters using the Flower federated learning framework, with curtailment periods derived from real-world marginal carbon intensity traces. Preliminary results show that curtailment-aware scheduling preserves training quality while reducing operational emissions to 5–12% of single-site baselines.
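To make the elastic scheduling concrete, the core decision can be sketched as a policy that maps each site's marginal carbon intensity to a training mode: federated synchronization when two or more sites are in curtailment windows, local single-site training when exactly one is, and idle otherwise. This is a minimal illustration under assumed conditions, not the report's implementation; the threshold value, site names, and intensity figures below are hypothetical placeholders, and the actual system derives curtailment windows from real-world intensity traces and runs on Flower.

```python
from dataclasses import dataclass

# Hypothetical threshold (gCO2/kWh): marginal intensity at or below this value
# is treated as a curtailment window (surplus clean generation on the grid).
CURTAILMENT_THRESHOLD = 20.0


@dataclass
class Site:
    name: str
    marginal_intensity: float  # current marginal carbon intensity, gCO2/kWh


def in_curtailment(site: Site) -> bool:
    """A site is in a curtailment window when its marginal intensity is near zero."""
    return site.marginal_intensity <= CURTAILMENT_THRESHOLD


def schedule(sites: list[Site]) -> dict[str, str]:
    """Assign each site a mode for the current window.

    Two or more curtailed sites -> federated multi-site synchronization;
    exactly one -> local single-site training; none -> all sites idle.
    """
    active = [s for s in sites if in_curtailment(s)]
    if len(active) >= 2:
        mode = "federated"
    elif len(active) == 1:
        mode = "local"
    else:
        mode = "idle"
    return {s.name: (mode if in_curtailment(s) else "idle") for s in sites}


# Illustrative intensity values (not from the report's traces):
sites = [Site("us-west", 5.0), Site("eu-north", 12.0), Site("asia-east", 400.0)]
print(schedule(sites))
# -> {'us-west': 'federated', 'eu-north': 'federated', 'asia-east': 'idle'}
```

In a real deployment the policy would re-evaluate at each synchronization round, letting sites join or leave the federated group as their curtailment windows open and close.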