🤖 AI Summary
Despite rapid advances in large language models (LLMs), practical deployment of LLM-based agents remains challenging—not due to insufficient model capability, but because of a severe imbalance between value (e.g., information quality, task completion rate) and cost (e.g., latency, API expenses, reasoning steps).
Method: The paper introduces "Agentic Return on Investment (Agentic ROI)" as a unified evaluation paradigm for LLM agents, moving beyond purely performance-oriented metrics. The authors propose a three-dimensional analytical framework (information quality, agent time, and cost) and identify a zigzag optimization strategy: scale up first to enhance quality, then scale down to reduce time and cost, integrating utility analysis, quality assessment, cost modeling, and step-wise constraints.
Contribution/Results: The framework enables cross-dimensional, quantitative evaluation of agent efficacy and provides a scalable, low-cost, high-availability roadmap for engineering LLM agents into production systems.
📝 Abstract
Large Language Model (LLM) agents represent a promising shift in human-AI interaction, moving beyond passive prompt-response systems to autonomous agents capable of reasoning, planning, and goal-directed action. Despite their widespread application in specialized, high-effort tasks like coding and scientific research, we highlight a critical usability gap in high-demand, mass-market applications. This position paper argues that the limited real-world adoption of LLM agents stems not only from gaps in model capabilities, but also from a fundamental tradeoff between the value an agent can provide and the costs incurred during real-world use. Hence, we call for a shift from solely optimizing model performance to a broader, utility-driven perspective: evaluating agents through the lens of the overall agentic return on investment (Agentic ROI). By identifying the key factors that determine Agentic ROI (information quality, agent time, and cost), we posit a zigzag development trajectory for optimizing Agentic ROI: first scaling up to improve information quality, then scaling down to minimize time and cost. We outline a roadmap across different development stages to bridge the current usability gaps, aiming to make LLM agents truly scalable, accessible, and effective in real-world contexts.
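As a rough illustration of the value-versus-cost tradeoff described above, the sketch below computes a toy Agentic ROI score from the three factors the paper names (information quality, agent time, and cost). The exact formula, the time-to-dollars conversion rate, and all identifiers are assumptions made here for illustration; the paper does not prescribe this computation.

```python
from dataclasses import dataclass

@dataclass
class AgentRun:
    # Hypothetical measurements of a single agent run.
    info_quality: float    # value delivered, e.g. task success rate in [0, 1]
    agent_time_s: float    # end-to-end latency in seconds
    api_cost_usd: float    # monetary expense (API calls, compute)

def agentic_roi(run: AgentRun, time_price_usd_per_s: float = 0.01) -> float:
    """Toy Agentic ROI: value delivered divided by total cost, with latency
    priced into dollars so time and money share one denominator (an assumption)."""
    total_cost = run.api_cost_usd + run.agent_time_s * time_price_usd_per_s
    return run.info_quality / total_cost

# "Scale up" run: higher quality, but slower and more expensive.
scaled_up = AgentRun(info_quality=0.92, agent_time_s=120.0, api_cost_usd=0.50)
# "Scale down" run: nearly the same quality at a fraction of the cost.
scaled_down = AgentRun(info_quality=0.88, agent_time_s=20.0, api_cost_usd=0.10)

print(f"{agentic_roi(scaled_up):.3f}")    # 0.92 / (0.50 + 1.20) ≈ 0.541
print(f"{agentic_roi(scaled_down):.3f}")  # 0.88 / (0.10 + 0.20) ≈ 2.933
```

Under these made-up numbers, the scaled-down agent gives roughly five times the ROI despite slightly lower quality, which is the intuition behind the zigzag trajectory: gains in quality eventually matter less than reductions in time and cost.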