🤖 AI Summary
LLM-powered coding agents face significant challenges in industrial deployment, namely high resource consumption (single runs can exceed 100K tokens) and poor scalability, making it difficult to simultaneously achieve operational efficiency (“green agents”) and high-quality code generation (“green code”). This work is the first to systematically investigate the Pareto trade-off between these competing objectives. We propose GA4GC, a multi-objective configuration optimization framework that jointly tunes agent hyperparameters and prompt templates, with temperature emerging as the most critical control variable. Evaluated on the SWE-Perf benchmark, the framework reduces agent runtime by 37.7%, improves code correctness, and achieves up to a 135× gain in hypervolume, a metric quantifying Pareto-front improvement. Key contributions include: (i) uncovering temperature’s pivotal role in balancing efficiency and quality; (ii) delivering the first practically deployable strategy for green coding agents; and (iii) establishing a new paradigm for sustainable AI engineering.
📝 Abstract
Coding agents powered by LLMs face critical sustainability and scalability challenges in industrial deployment, with single runs consuming over 100K tokens and incurring environmental costs that may exceed their optimization benefits. This paper introduces GA4GC, the first framework to systematically navigate the trade-off between coding agent runtime (greener agent) and code performance (greener code) by discovering Pareto-optimal agent hyperparameters and prompt templates. Evaluation on the SWE-Perf benchmark demonstrates up to a 135× hypervolume improvement, reducing agent runtime by 37.7% while improving correctness. Our findings establish temperature as the most critical hyperparameter and provide actionable strategies for balancing agent sustainability with code optimization effectiveness in industrial deployment.
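To make the hypervolume figure concrete: for two minimization objectives (e.g. agent runtime and a code-quality error measure), hypervolume is the area dominated by the Pareto front up to a fixed reference point, so a larger value means a better front. The sketch below is purely illustrative; the function names, objectives, and sample numbers are assumptions for this example, not GA4GC's actual implementation or data.

```python
def pareto_front(points):
    """Return the non-dominated points for two minimization objectives,
    sorted by the first objective. A point is dominated if some other
    point is at least as good on both objectives."""
    return sorted(
        p for p in points
        if not any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)
    )

def hypervolume_2d(front, ref):
    """Area dominated by a 2-D Pareto front, bounded by reference point
    `ref` (worst acceptable value on each objective). Sweeps the front
    left to right, accumulating one rectangle per front point."""
    hv, prev_y = 0.0, ref[1]
    for x, y in sorted(front):
        hv += (ref[0] - x) * (prev_y - y)
        prev_y = y
    return hv

# Hypothetical agent configurations scored as (runtime_sec, error_rate).
configs = [(120.0, 0.30), (80.0, 0.45), (95.0, 0.25), (150.0, 0.20)]
front = pareto_front(configs)
print(front)                              # the non-dominated configurations
print(hypervolume_2d(front, (200.0, 1.0)))
```

A configuration change that shrinks runtime or error rate without worsening the other pushes the front toward the origin and grows this area, which is how a single scalar can summarize a "135× improvement" across both objectives.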