Energy-Aware LLMs: A step towards sustainable AI for downstream applications

📅 2025-03-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the excessive energy consumption of large language models (LLMs) in telecom network fault ticket analysis, this paper proposes an end-to-end energy-efficiency optimization method. The authors first systematically quantify the trade-off between energy efficiency and performance on real-world root cause analysis and response feedback tasks, a trade-off that had not previously been studied in this domain. Their approach combines hybrid low-bit quantization, structured pruning, and task-adaptive fine-tuning with a dedicated energy consumption modeling and evaluation framework. Experiments on two real telecom fault datasets show up to 47% lower energy consumption alongside a 3.2% gain in root cause identification F1-score and a 5.8% gain in response feedback accuracy. The core contribution is empirical evidence that quantization and pruning can jointly improve both accuracy and energy efficiency, establishing a reproducible, rigorously evaluated path for deploying LLMs in resource-constrained telecom operations and maintenance scenarios.
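
The summary names the optimization techniques but gives no implementation details. As a hedged illustration only, the sketch below shows one common way to combine structured pruning with low-bit quantization on a Hugging Face causal LM; the checkpoint name, pruning ratio, and int8 dynamic quantization scheme are assumptions for the example, not the paper's configuration.

```python
# Hedged sketch: structured pruning followed by low-bit quantization of a
# causal LM. Checkpoint, pruning ratio, and bit width are illustrative
# placeholders, not the settings reported in the paper.
import torch
import torch.nn.utils.prune as prune
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # placeholder checkpoint

# Structured pruning: zero out 20% of the rows (output channels) of every
# linear projection, ranked by L2 norm. The rows are masked, not physically
# removed; 0.2 is an arbitrary example ratio.
for module in model.modules():
    if isinstance(module, torch.nn.Linear):
        prune.ln_structured(module, name="weight", amount=0.2, n=2, dim=0)
        prune.remove(module, "weight")  # bake the pruning mask into the weights

# Low-bit quantization: dynamic int8 quantization of the remaining linear
# layers for CPU inference (one of several possible quantization schemes).
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
```
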

📝 Abstract
Advanced Large Language Models (LLMs) have revolutionized various fields, including communication networks, sparking a wave of innovation that has led to new applications and services and significantly enhanced existing solutions. Despite these impressive developments, most LLMs require huge computational resources, resulting in very high energy consumption. This study therefore proposes an end-to-end pipeline that investigates the trade-off between energy efficiency and model performance for an LLM applied to fault ticket analysis in communication networks. It evaluates the pipeline on two real-world datasets for the tasks of root cause analysis and response feedback in a communication network. Our results show that an appropriate combination of quantization and pruning techniques can reduce energy consumption while significantly improving model performance.
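
The abstract frames the contribution as a measured trade-off, which presupposes per-task energy accounting. The paper's own measurement framework is not reproduced here; the snippet below is a minimal sketch of one way to read a GPU's on-board energy counter around an inference call using pynvml, with a placeholder model and prompt.

```python
# Hedged sketch: per-call GPU energy measurement via NVML.
# nvmlDeviceGetTotalEnergyConsumption returns millijoules accumulated by the
# GPU (supported on Volta and newer); the model and prompt are placeholders.
import pynvml
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

pynvml.nvmlInit()
gpu = pynvml.nvmlDeviceGetHandleByIndex(0)

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")  # placeholder
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m").to("cuda")

prompt = "Fault ticket: intermittent link flapping reported on node A."  # placeholder
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

start_mj = pynvml.nvmlDeviceGetTotalEnergyConsumption(gpu)
with torch.no_grad():
    model.generate(**inputs, max_new_tokens=64)
end_mj = pynvml.nvmlDeviceGetTotalEnergyConsumption(gpu)

print(f"Energy for this generation: {(end_mj - start_mj) / 1000.0:.1f} J")
pynvml.nvmlShutdown()
```
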
Problem

Research questions and friction points this paper is trying to address.

Balancing energy efficiency and performance in LLMs
Reducing energy consumption in fault ticket analysis
Optimizing LLMs for sustainable AI applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

End-to-end pipeline for energy-performance trade-off (a toy trade-off metric is sketched after this list)
Quantization and pruning reduce energy consumption
Improves model performance in network fault analysis
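
How "lower energy while better performance" is scored jointly is not specified in this card. As a purely illustrative aid, not the paper's evaluation protocol, one simple composite metric is task score per unit of measured energy:

```python
# Hedged sketch: a toy composite metric (task score per kilojoule).
# The numbers below are invented for illustration, not results from the paper.
def score_per_kilojoule(task_score: float, energy_joules: float) -> float:
    """Return task-score points earned per kilojoule of measured energy."""
    return task_score / (energy_joules / 1000.0)

# Example comparison of a full-precision baseline vs. a pruned + quantized model.
baseline = score_per_kilojoule(task_score=0.82, energy_joules=5400.0)
optimized = score_per_kilojoule(task_score=0.85, energy_joules=2900.0)
print(f"baseline: {baseline:.3f} /kJ  optimized: {optimized:.3f} /kJ")
```
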