How Green are Neural Language Models? Analyzing Energy Consumption in Text Summarization Fine-tuning

📅 2025-01-26
🤖 AI Summary
The environmental cost of fine-tuning large language models (LLMs) for scientific text summarization remains poorly quantified, especially when comparing established encoder-decoder models with emerging open-weight LLMs. Method: The authors evaluate energy-efficiency and performance trade-offs when fine-tuning T5-base, BART-base, and LLaMA 3-8B to generate research paper highlights. Power consumption is measured empirically and mapped to carbon footprint; semantic quality is assessed along multiple dimensions with ROUGE, METEOR, MoverScore, BERTScore, and SciBERTScore. Contribution/Results: The paper offers a comparative analysis of operational environmental costs across mid-scale pre-trained models and a modern open LLM. Energy use grows nonlinearly with parameter count: fine-tuning LLaMA 3-8B incurs substantially higher carbon emissions than T5-base or BART-base, yet yields only marginal gains in summary quality. This exposes efficiency bottlenecks in green LLM deployment and establishes a reproducible, quantitative benchmark to guide low-carbon AI development and model selection.

📝 Abstract
Artificial intelligence systems significantly impact the environment, particularly in natural language processing (NLP) tasks. These tasks often require extensive computational resources to train deep neural networks, including large-scale language models containing billions of parameters. This study analyzes the trade-offs between energy consumption and performance across three neural language models: two pre-trained models (T5-base and BART-base), and one large language model (LLaMA 3-8B). These models were fine-tuned for the text summarization task, focusing on generating research paper highlights that encapsulate the core themes of each paper. A wide range of evaluation metrics, including ROUGE, METEOR, MoverScore, BERTScore, and SciBERTScore, were employed to assess their performance. Furthermore, the carbon footprint associated with fine-tuning each model was measured, offering a comprehensive assessment of their environmental impact. This research underscores the importance of incorporating environmental considerations into the design and implementation of neural language models and calls for the advancement of energy-efficient AI methodologies.
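The abstract describes measuring the carbon footprint of each fine-tuning run. A minimal sketch of that power-to-carbon mapping is below; the helper name, the power figures, and the 475 g CO2/kWh grid-intensity default are illustrative assumptions, not the paper's measurements or tooling:

```python
# Assumed helper, not from the paper: convert measured power draw
# during fine-tuning into energy use and a CO2-equivalent estimate.
def fine_tuning_footprint(avg_power_w: float, hours: float,
                          grid_intensity_g_per_kwh: float = 475.0):
    """Return (energy_kwh, co2_kg) for one fine-tuning run.

    avg_power_w: mean power draw of the training hardware, in watts.
    hours: wall-clock training time.
    grid_intensity_g_per_kwh: carbon intensity of the local electricity
        mix; 475 g CO2/kWh is a rough global-average placeholder.
    """
    energy_kwh = avg_power_w * hours / 1000.0                # W*h -> kWh
    co2_kg = energy_kwh * grid_intensity_g_per_kwh / 1000.0  # g -> kg
    return energy_kwh, co2_kg

# Illustrative numbers only: a 300 W accelerator running for 10 hours.
print(fine_tuning_footprint(300.0, 10.0))  # -> (3.0, 1.425)
```

In practice the power and energy readings on the left-hand side of this mapping come from measurement tools (e.g. periodic `nvidia-smi` sampling or emissions trackers such as CodeCarbon) rather than a fixed average, but the unit conversion is the same.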
Problem

Research questions and friction points this paper is trying to address.

Neural Language Models
Text Summarization
Environmental Impact
Innovation

Methods, ideas, or system contributions that make the work stand out.

Neural Language Models
Energy Efficiency
Environmental Impact