🤖 AI Summary
A systematic understanding of large language model (LLM) applications in software architecture remains absent. Method: We conduct a systematic literature review (SLR), analyzing 18 primary studies to construct the first comprehensive application taxonomy of LLMs in software architecture, spanning six core tasks: design decision classification, architectural pattern recognition, architecture generation, cloud-native support, consistency verification, and documentation synthesis. Contribution/Results: Our analysis reveals critical research gaps, particularly in cloud-native architecture support and architectural consistency verification, and identifies an evolutionary trend from simple prompting toward integrated enhancement techniques (e.g., retrieval-augmented generation and multi-agent collaboration). Empirical findings show that LLMs significantly outperform traditional baselines across most architectural tasks. We identify three persistent technical challenges (limited interpretability, inadequate contextual modeling, and lack of standardized evaluation) and four emerging enhancement strategies. This work provides an evidence-based foundation and a concrete technical roadmap for future research in LLM-driven software architecture.
📄 Abstract
Large Language Models (LLMs) are used for many different software engineering tasks. In software architecture, they have been applied to tasks such as the classification of design decisions, the detection of design patterns, and the generation of software architecture designs from requirements. However, there is little systematic overview of how well they work, what challenges exist, and what open problems remain. In this paper, we present a systematic literature review on the use of LLMs in software architecture. We analyze 18 research articles to answer five research questions, including which software architecture tasks LLMs are used for, how much automation they provide, which models and techniques are used, and how these approaches are evaluated. Our findings show that while LLMs are increasingly applied to a variety of software architecture tasks and often outperform baselines, some areas, such as generating source code from architectural design, cloud-native computing and architecture, and checking conformance, remain underexplored. Although current approaches mostly use simple prompting techniques, we identify a growing research interest in refining LLM-based approaches by integrating advanced techniques.
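To make the distinction between "simple prompting" and the retrieval-augmented approaches mentioned above concrete, the following is a minimal, illustrative sketch in Python. It contrasts a zero-shot prompt for design decision classification with a prompt enriched by a toy lexical retriever; all function names, the label set, and the example corpus are our own illustrative assumptions, not artifacts from the reviewed studies.

```python
# Illustrative sketch: simple prompting vs. retrieval-augmented prompting
# for an architectural task (design decision classification).
# The label set and retriever are toy assumptions, not from any primary study.

def simple_prompt(decision_text: str) -> str:
    """Zero-shot prompt: the model sees only the decision text."""
    return (
        "Classify the following architectural design decision as "
        "'structural', 'behavioral', or 'ban':\n"
        f"{decision_text}"
    )

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Toy lexical retriever: rank corpus entries by word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(corpus, key=lambda doc: -len(q_words & set(doc.lower().split())))
    return scored[:k]

def rag_prompt(decision_text: str, corpus: list[str]) -> str:
    """Retrieval-augmented prompt: prepend related past decisions as context."""
    context = "\n".join(retrieve(decision_text, corpus))
    return (
        "Context (related past decisions):\n"
        f"{context}\n\n" + simple_prompt(decision_text)
    )
```

In practice, the reviewed approaches replace the toy retriever with embedding-based search over project documentation, and may route the resulting prompt through multiple cooperating agents; the sketch only shows where retrieved context enters the prompt.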