Quality Time: Carbon-Aware Quality Adaptation for Energy-Intensive Services

📅 2024-11-28
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Addressing carbon emissions from energy-intensive cloud services, particularly generative AI, this work tackles the challenge of reducing operational carbon footprints while maintaining service quality and regulatory compliance. Method: We propose a carbon-aware Quality-of-Experience (QoE) orchestration framework that treats LLM response quality (e.g., output length, accuracy) as an adjustable dimension for carbon optimization. The framework dynamically adapts to real-time grid carbon intensity under latency and data-locality constraints, integrating multi-timescale carbon-intensity forecasting, multi-objective integer programming under a hard annual carbon budget, online feedback control, and hierarchical SLA modeling. Contribution/Results: Evaluated on large-scale LLM inference workloads, our approach achieves up to a 10% reduction in service-related carbon emissions, which we estimate at tens of thousands of tons of CO₂ annually, while preserving user availability and ensuring adherence to environmental regulations.
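The tier-selection step can be illustrated with a toy version of the optimization. The tier parameters, SLA floor, and brute-force grid search below are illustrative assumptions for the sketch, not the paper's actual formulation (which uses multi-objective integer programming under a hard annual carbon budget):

```python
from itertools import product

# Hypothetical per-tier parameters (illustrative, not from the paper):
# energy per request in Wh and a normalized response-quality score.
TIERS = {
    "full":    {"wh_per_req": 4.0, "quality": 1.00},
    "reduced": {"wh_per_req": 2.0, "quality": 0.85},
    "minimal": {"wh_per_req": 0.8, "quality": 0.60},
}

def best_tier_mix(carbon_gco2_per_kwh, min_avg_quality=0.8, step=0.05):
    """Brute-force the request fractions per tier that minimize carbon
    per request while keeping average quality above the SLA floor."""
    names = list(TIERS)
    n_steps = int(round(1 / step))
    best = None
    for counts in product(range(n_steps + 1), repeat=len(names)):
        if sum(counts) != n_steps:            # fractions must sum to 1
            continue
        mix = {n: c * step for n, c in zip(names, counts)}
        quality = sum(mix[n] * TIERS[n]["quality"] for n in names)
        if quality < min_avg_quality - 1e-9:  # SLA quality constraint
            continue
        wh = sum(mix[n] * TIERS[n]["wh_per_req"] for n in names)
        gco2 = wh / 1000 * carbon_gco2_per_kwh  # grams CO2 per request
        if best is None or gco2 < best[0]:
            best = (gco2, mix)
    return best

# In a high-carbon hour (450 gCO2/kWh), the search shifts most traffic
# to the cheaper tiers while holding average quality at the floor.
gco2, mix = best_tier_mix(carbon_gco2_per_kwh=450)
```

Because the objective and constraint are both linear in the tier fractions, the optimum lands on a two-tier boundary where the quality constraint binds; the real system would re-solve this as the carbon-intensity forecast changes.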

📝 Abstract
The energy demand of modern cloud services, particularly those related to generative AI, is increasing at an unprecedented pace. While hyperscalers collectively fail to meet their self-imposed emission reduction targets, they face increasing pressure from environmental sustainability reporting across many jurisdictions. To date, carbon-aware computing strategies have primarily focused on batch process scheduling or geo-distributed load balancing. However, such approaches are not applicable to services that require constant availability at specific locations due to latency, privacy, data, or infrastructure constraints. In this paper, we explore how the carbon footprint of energy-intensive services can be reduced by adjusting the fraction of requests served by different service quality tiers. We show that adapting the quality of responses with respect to grid carbon intensity can lead to additional carbon savings beyond resource and energy efficiency. Building on this, we introduce a forecast-based multi-horizon optimization that reaches close-to-optimal carbon savings and is able to automatically adapt service quality for best-effort users to stay within an annual carbon budget. Our approach can reduce the emissions of large-scale LLM services, which we estimate at tens of thousands of tons of CO2 annually, by up to 10%.
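As a rough illustration of the budget-tracking idea, the controller below compares year-to-date carbon spend against a pro-rata share of the annual budget and nudges best-effort quality up or down. The proportional gain, quality bounds, and linear target are illustrative assumptions, not the paper's control law:

```python
def adjust_quality(quality, spent_tons, budget_tons, day_of_year,
                   gain=0.5, lo=0.5, hi=1.0):
    """Return an updated best-effort quality level in [lo, hi].

    A positive error (under budget) raises quality; a negative error
    (over budget) lowers it, steering the service toward ending the
    year on its annual carbon budget.
    """
    target = budget_tons * day_of_year / 365.0   # pro-rata target so far
    error = (target - spent_tons) / budget_tons  # normalized budget slack
    return min(hi, max(lo, quality + gain * error))

# Mid-year and ahead of budget: quality can be raised slightly.
q_up = adjust_quality(0.90, spent_tons=40.0, budget_tons=100.0,
                      day_of_year=182)
# Mid-year and over budget: quality is throttled back.
q_down = adjust_quality(0.90, spent_tons=60.0, budget_tons=100.0,
                        day_of_year=182)
```

A production controller would combine such feedback with the carbon-intensity forecast rather than react to realized spend alone.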
Problem

Research questions and friction points this paper is trying to address.

Energy Efficiency
Cloud Services
Environmental Impact

Innovation

Methods, ideas, or system contributions that make the work stand out.

Energy Efficiency
Dynamic Service Adjustment
Carbon Emission Reduction