🤖 AI Summary
Large language model (LLM) services delivered over wireless networks carry a high carbon footprint, yet existing studies largely neglect communication-related emissions. To address this gap, this paper proposes AOLO, a framework built on an end-to-end carbon footprint model that jointly characterizes greenhouse gas emissions from inference computation and wireless transmission. Building on this model, it introduces a low-carbon-oriented deep reinforcement learning algorithm, termed SDRL, which uses a spiking neural network (SNN) as the actor to jointly optimize inference outputs and base station transmit power under quality-of-experience (QoE) and system performance constraints. In simulations, SDRL reduces the total carbon footprint by 18.77% compared with the soft actor-critic baseline. Key contributions include: (i) a holistic carbon modeling framework for wireless LLM services spanning the entire service chain, and (ii) an SNN-driven reinforcement learning architecture that co-optimizes computation and communication for carbon reduction.
📝 Abstract
Recent advancements in large language models (LLMs) have led to their widespread adoption and large-scale deployment across various domains. However, their environmental impact, particularly during inference, has become a growing concern due to their substantial energy consumption and carbon footprint. Existing research has focused on inference computation alone, overlooking the analysis and optimization of the carbon footprint in network-aided LLM service systems. To address this gap, we propose AOLO, a framework for the analysis and optimization of low-carbon-oriented wireless LLM services. AOLO introduces a comprehensive carbon footprint model that quantifies greenhouse gas emissions across the entire LLM service chain, including computational inference and wireless communication. Furthermore, we formulate an optimization problem that minimizes the overall carbon footprint by jointly optimizing inference outputs and transmit power under quality-of-experience and system performance constraints. To solve this problem, we exploit the energy efficiency of spiking neural networks (SNNs) by adopting an SNN as the actor network, and propose a low-carbon-oriented optimization algorithm, SNN-based deep reinforcement learning (SDRL). Comprehensive simulations demonstrate that the SDRL algorithm reduces the overall carbon footprint by 18.77% compared with the soft actor-critic benchmark, highlighting its potential for enabling more sustainable LLM inference services.
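To make the idea of a service-chain carbon footprint concrete, the following is a minimal sketch, not the model defined in the paper. It assumes that inference energy scales linearly with the number of output tokens, that the response is sent at the Shannon rate for a single link, and that emissions are total energy multiplied by a grid carbon intensity. All names (`ServiceRequest`, `energy_per_token_j`, `carbon_footprint_g`) and all numeric values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ServiceRequest:
    """Hypothetical description of one wireless LLM request (illustrative only)."""
    output_tokens: int         # tokens generated by the LLM
    energy_per_token_j: float  # assumed inference energy per output token (J)
    payload_bits: float        # size of the response sent over the air (bits)
    transmit_power_w: float    # base station transmit power (W)
    bandwidth_hz: float        # channel bandwidth (Hz)
    channel_gain: float        # linear channel power gain
    noise_power_w: float       # receiver noise power (W)


def shannon_rate_bps(req: ServiceRequest) -> float:
    """Achievable rate of a single link under the Shannon capacity formula."""
    snr = req.transmit_power_w * req.channel_gain / req.noise_power_w
    return req.bandwidth_hz * math.log2(1.0 + snr)


def carbon_footprint_g(req: ServiceRequest, intensity_g_per_kwh: float) -> float:
    """Emissions (g CO2e) from inference plus transmission, under the stated assumptions."""
    inference_energy_j = req.output_tokens * req.energy_per_token_j
    airtime_s = req.payload_bits / shannon_rate_bps(req)
    transmission_energy_j = req.transmit_power_w * airtime_s
    total_kwh = (inference_energy_j + transmission_energy_j) / 3.6e6  # J -> kWh
    return total_kwh * intensity_g_per_kwh


if __name__ == "__main__":
    request = ServiceRequest(
        output_tokens=100, energy_per_token_j=3.6, payload_bits=2e6,
        transmit_power_w=1.0, bandwidth_hz=1e6, channel_gain=3.0, noise_power_w=1.0,
    )
    print(f"{carbon_footprint_g(request, intensity_g_per_kwh=400.0):.4f} g CO2e")
```

In this toy setting the two decision variables named in the abstract appear as `output_tokens` (a stand-in for the inference output) and `transmit_power_w`. Shorter outputs lower inference energy, while transmit power changes both the airtime and the power drawn during it, which is why the two are coupled in a joint optimization.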