Promoting Sustainable Web Agents: Benchmarking and Estimating Energy Consumption through Empirical and Theoretical Analysis

📅 2025-11-06
🤖 AI Summary
Web agents (e.g., Operator, Project Mariner) extend the capabilities of large language models (LLMs) to interactive web environments, yet their energy consumption and carbon footprint, both critical sustainability concerns, have not been systematically assessed. This work presents an initial study of energy usage and carbon emissions in web agents, combining theoretical estimation with empirical benchmarking across different design philosophies. The authors find that higher energy consumption does not consistently improve task performance, and that some current systems do not disclose their model parameters and execution workflows, which hinders accurate energy estimation. Accordingly, they propose treating energy consumption as a dedicated evaluation metric in web agent benchmarks. They emphasize transparency about both models and processes so that energy accounting can be reproduced. The methodology and recommendations offer guidance for developing more energy-efficient web agents.

📝 Abstract
Web agents, such as OpenAI's Operator and Google's Project Mariner, are powerful agentic systems that push the boundaries of Large Language Models (LLMs). They can autonomously interact with the internet on the user's behalf, for example by navigating websites, filling in search forms, and comparing price lists. Although web agent research is thriving, the sustainability issues it induces remain largely unexplored. To highlight the urgency of this issue, we provide an initial exploration of the energy and $CO_2$ cost associated with web agents from both a theoretical perspective (via estimation) and an empirical perspective (via benchmarking). Our results show that different philosophies in web agent creation can severely impact the energy expended, and that consuming more energy does not necessarily yield better results. We highlight the lack of transparency about the model parameters and processes used by some web agents as a limiting factor when estimating energy consumption. Our work contributes to a change in how web agents are evaluated, advocating for dedicated metrics that measure energy consumption in benchmarks.
Problem

Research questions and friction points this paper is trying to address.

Measuring energy and CO2 costs of web agents through empirical benchmarking
Analyzing how different web agent designs impact energy consumption efficiency
Addressing lack of transparency in model parameters for energy estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Benchmarking web agent energy consumption empirically
Estimating energy costs through theoretical analysis
Advocating dedicated energy metrics in evaluations
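One way a dedicated energy metric could be reported alongside task performance is sketched below. This is a minimal illustration under stated assumptions, not the paper's method: the `AgentRun` fields, the function names, and the normalization by success rate are all hypothetical, and the carbon intensity must be supplied by the caller because it depends on the local electricity grid. The only physics used is that energy equals average power multiplied by time, and that emissions equal energy multiplied by grid carbon intensity.

```python
from dataclasses import dataclass


@dataclass
class AgentRun:
    """One benchmark run of a web agent (hypothetical record layout)."""
    name: str
    avg_power_watts: float   # mean power draw measured during the run
    duration_seconds: float  # wall-clock duration of the run
    success_rate: float      # fraction of benchmark tasks solved, in [0, 1]


def energy_kwh(run: AgentRun) -> float:
    """Energy = power x time, converted from watt-seconds (joules) to kWh."""
    return run.avg_power_watts * run.duration_seconds / 3_600_000


def co2_grams(run: AgentRun, carbon_intensity_g_per_kwh: float) -> float:
    """Emissions = energy x grid carbon intensity (caller-supplied assumption)."""
    return energy_kwh(run) * carbon_intensity_g_per_kwh


def energy_per_success(run: AgentRun) -> float:
    """Energy normalized by success rate; infinite if no task was solved."""
    if run.success_rate <= 0:
        return float("inf")
    return energy_kwh(run) / run.success_rate
```

Normalizing by success rate is one possible design choice: it makes visible the case where an agent consumes more energy without achieving better results.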
L. Krupp
RPTU Kaiserslautern-Landau, German Research Center for Artificial Intelligence (DFKI)
Daniel Geissler
RPTU Kaiserslautern-Landau, German Research Center for Artificial Intelligence (DFKI)
Vishal Banwari
RPTU Kaiserslautern-Landau, German Research Center for Artificial Intelligence (DFKI)
P. Lukowicz
RPTU Kaiserslautern-Landau, German Research Center for Artificial Intelligence (DFKI)
Jakob Karolus
DFKI and RPTU Kaiserslautern-Landau
Human-Centric Artificial Intelligence, Physiological Sensing