From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future

📅 2024-08-05
🏛️ arXiv.org
📈 Citations: 17
Influential: 0
🤖 AI Summary
Large language models (LLMs) and LLM-based agents in software engineering suffer from conceptual ambiguity and a lack of standardized evaluation criteria. Method: This work systematically delineates their capability boundaries and fundamental distinctions across six task dimensions (requirement engineering, code generation, autonomous decision-making, software design, test generation, and software maintenance) via systematic literature review, cross-task comparative analysis, and meta-analysis of existing benchmarks and metrics. Contribution/Results: The survey proposes the first taxonomy unifying LLM- and agent-based approaches across these six software engineering tasks and introduces an initial standardized framework for evaluating agent capabilities in this domain. Key challenges are identified, including insufficient autonomy and fragmented, non-comparable benchmarks. The study establishes a theoretical foundation and an evolutionary roadmap for AGI-driven software engineering agents.

📝 Abstract
With the rise of large language models (LLMs), researchers are increasingly exploring their applications in various vertical domains, such as software engineering. LLMs have achieved remarkable success in areas including code generation and vulnerability detection. However, they also exhibit numerous limitations and shortcomings. LLM-based agents, a novel technology with the potential for Artificial General Intelligence (AGI), combine LLMs as the core for decision-making and action-taking, addressing some of the inherent limitations of LLMs such as lack of autonomy and self-improvement. Despite numerous studies and surveys exploring the possibility of using LLMs in software engineering, the literature lacks a clear distinction between LLMs and LLM-based agents, and a unified standard and benchmark for qualifying an LLM solution as an LLM-based agent in its domain is still in its early stage. In this survey, we broadly investigate the current practice and solutions for LLMs and LLM-based agents in software engineering. In particular, we summarise six key topics: requirement engineering, code generation, autonomous decision-making, software design, test generation, and software maintenance. We review and differentiate the work of LLMs and LLM-based agents across these six topics, examining their differences and similarities in tasks, benchmarks, and evaluation metrics. Finally, we discuss the models and benchmarks used, providing a comprehensive analysis of their applications and effectiveness in software engineering. We anticipate this work will shed some light on pushing the boundaries of LLM-based agents in software engineering for future research.
Problem

Research questions and friction points this paper is trying to address.

Distinguishing LLMs from LLM-based agents in software engineering
Addressing LLM limitations such as lack of autonomy and self-improvement
Surveying LLM and agent applications in six software engineering topics
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based agents enhance autonomy and self-improvement
Survey covers six key software engineering topics
Differentiates LLMs from agents in tasks, benchmarks, and evaluation metrics