Promoting Sustainable Web Agents: Benchmarking and Estimating Energy Consumption through Empirical and Theoretical Analysis

📅 2025-11-06

🤖 AI Summary

Web agents (e.g., Operator, Project Mariner) extend large language models (LLMs) to interactive web environments, yet their energy consumption and carbon footprint have not been systematically assessed. This work presents an initial study of energy usage and carbon emissions in web agents, combining theoretical estimation with empirical benchmarking across different agent designs, including action granularity, planning depth, and tool invocation strategies. We find that higher energy consumption does not consistently improve task performance, and that undisclosed model parameters and execution workflows in some current systems hinder accurate energy estimation. Accordingly, we propose treating energy consumption as a primary evaluation metric and advocate dedicated energy metrics in web agent benchmarks. We emphasize auditability of both models and execution traces to enable transparent, reproducible energy accounting. Our methodology and recommendations offer guidance for developing energy-efficient web agents.

📝 Abstract

Web agents, like OpenAI's Operator and Google's Project Mariner, are powerful agentic systems pushing the boundaries of Large Language Models (LLMs). They can autonomously interact with the internet at the user's behest, for example by navigating websites, filling in search forms, and comparing price lists. Though web agent research is thriving, the sustainability issues it induces remain largely unexplored. To highlight the urgency of this issue, we provide an initial exploration of the energy and $\mathrm{CO}_2$ cost associated with web agents from both a theoretical perspective (via estimation) and an empirical perspective (via benchmarking). Our results show how different philosophies in web agent creation can severely impact the energy expended, and that more energy consumed does not necessarily equate to better results. We highlight a lack of transparency about the model parameters and processes used by some web agents as a limiting factor when estimating energy consumption. Our work contributes towards a change in thinking about how we evaluate web agents, advocating for dedicated metrics measuring energy consumption in benchmarks.

Problem

Research questions and friction points this paper is trying to address.

Measuring energy and CO2 costs of web agents through empirical benchmarking

Analyzing how different web agent designs impact energy consumption efficiency

Addressing lack of transparency in model parameters for energy estimation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Benchmarking web agent energy consumption empirically

Estimating energy costs through theoretical analysis

Advocating dedicated energy metrics in evaluations
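
To make the idea of a dedicated energy metric concrete, the following is a minimal sketch of how energy and $\mathrm{CO}_2$ could be estimated for a single agent run. It is not the paper's method: it only applies the standard relations energy = power × time and emissions = energy × grid carbon intensity. The class `RunProfile`, the function names, and every numeric value in the usage note are hypothetical placeholders.

```python
from dataclasses import dataclass

JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 million joules


@dataclass
class RunProfile:
    """Hypothetical description of one web agent run (all fields are assumptions)."""
    avg_power_watts: float        # average hardware power draw during the run
    duration_seconds: float       # wall-clock time spent on model inference
    pue: float = 1.0              # data-center power usage effectiveness (>= 1)
    grid_g_co2_per_kwh: float = 0.0  # carbon intensity of the local grid


def energy_kwh(run: RunProfile) -> float:
    """Energy in kWh: power x time, scaled by data-center overhead (PUE)."""
    joules = run.avg_power_watts * run.duration_seconds * run.pue
    return joules / JOULES_PER_KWH


def co2_grams(run: RunProfile) -> float:
    """Emissions in grams of CO2: energy x grid carbon intensity."""
    return energy_kwh(run) * run.grid_g_co2_per_kwh


def energy_per_solved_task(runs: list[RunProfile], solved: int) -> float:
    """Total energy divided by the number of solved tasks.

    Reporting this next to the success rate shows whether extra energy
    actually bought better results.
    """
    if solved <= 0:
        raise ValueError("at least one solved task is required")
    return sum(energy_kwh(r) for r in runs) / solved
```

With placeholder values, `RunProfile(avg_power_watts=300, duration_seconds=120, pue=1.2, grid_g_co2_per_kwh=400)` gives 0.012 kWh and 4.8 g of $\mathrm{CO}_2$. For agents whose model parameters or hardware are undisclosed, the power and duration inputs would themselves have to be estimated, which is the transparency limitation the abstract points to.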
