AI Summary
Software architecture design is highly expertise-dependent, facing challenges including knowledge intensity, complex trade-offs, and frequent iterations. Existing LLM applications predominantly target isolated tasks and lack support for end-to-end architectural decision-making with integrated quality attribute modeling.
Method: We propose MAAD, a knowledge-driven, four-role multi-agent framework for architecture design. MAAD integrates modules for requirement parsing, architectural modeling, design reasoning, and quality assessment, and supports heterogeneous LLMs, including GPT-4o, DeepSeek-R1, and Llama 3.3.
Contribution/Results: MAAD pioneers domain-knowledge-guided automated architecture blueprint generation and embeds a structured quality attribute evaluation mechanism. In a case study and comparative experiments, MAAD produces more complete architectural components and more insightful evaluation reports than MetaGPT, and feedback from industrial architects across 11 requirements specifications supports its practical usability. Results demonstrate comprehensive improvements in design efficiency, solution diversity, and decision reliability.
Abstract
Software architecture design is a critical, yet inherently complex and knowledge-intensive phase of software development. It requires deep domain expertise, development experience, architectural knowledge, careful trade-offs among competing quality attributes, and the ability to adapt to evolving requirements. Traditionally, this process is time-consuming and labor-intensive, and relies heavily on architects, often resulting in limited design alternatives, especially under the pressures of agile development. While Large Language Model (LLM)-based agents have shown promising performance across various software engineering (SE) tasks, their application to architecture design remains relatively scarce and requires more exploration, particularly in light of diverse domain knowledge and complex decision-making. To address these challenges, we propose MAAD (Multi-Agent Architecture Design), an automated framework that employs a knowledge-driven Multi-Agent System (MAS) for architecture design. MAAD orchestrates four specialized agents (Analyst, Modeler, Designer, and Evaluator) to collaboratively interpret requirements specifications and produce architectural blueprints enriched with quality-attribute-based evaluation reports. We then evaluated MAAD through a case study and comparative experiments against MetaGPT, a state-of-the-art MAS baseline. Our results show that MAAD's advantage lies in generating comprehensive architectural components and delivering insightful, structured architecture evaluation reports. Feedback from industrial architects across 11 requirements specifications further reinforces MAAD's practical usability. Finally, we explored the performance of the MAAD framework with three LLMs (GPT-4o, DeepSeek-R1, and Llama 3.3) and found that GPT-4o performs better in producing architecture designs, emphasizing the importance of LLM selection in MAS-driven architecture design.
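To make the four-agent arrangement concrete, the following is a minimal sketch of one way such a pipeline could be wired together. It is not MAAD's implementation. Only the role names (Analyst, Modeler, Designer, Evaluator) come from the abstract. The strictly sequential hand-off, the instruction strings, and the names `Agent`, `stub_llm`, `build_pipeline`, and `run_pipeline` are all assumptions made for illustration. The model call is replaced by an offline stub so the example runs without any API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for a model call: takes a prompt, returns text.
LLM = Callable[[str], str]


@dataclass
class Agent:
    """One role in the pipeline (illustrative, not MAAD's actual class)."""
    role: str
    instruction: str
    llm: LLM

    def run(self, context: str) -> str:
        # Prefix the prompt with the role so the stub can identify the caller.
        prompt = f"[{self.role}] {self.instruction}\n\n{context}"
        return self.llm(prompt)


def stub_llm(prompt: str) -> str:
    """Offline stub: returns a tag naming the role instead of calling a model."""
    role = prompt[1:prompt.index("]")]
    return f"<{role} output>"


def build_pipeline(llm: LLM) -> List[Agent]:
    # Role names follow the abstract; the instructions are assumed wording.
    return [
        Agent("Analyst", "Extract requirements from the specification.", llm),
        Agent("Modeler", "Derive architectural models from the requirements.", llm),
        Agent("Designer", "Propose an architecture blueprint.", llm),
        Agent("Evaluator", "Assess the blueprint against quality attributes.", llm),
    ]


def run_pipeline(requirements: str, agents: List[Agent]) -> Dict[str, str]:
    """Run agents in order; each one sees all earlier outputs (an assumption)."""
    artifacts: Dict[str, str] = {}
    context = requirements
    for agent in agents:
        output = agent.run(context)
        artifacts[agent.role] = output
        context = f"{context}\n\n{agent.role}: {output}"
    return artifacts


if __name__ == "__main__":
    results = run_pipeline("Users can upload and share photos.", build_pipeline(stub_llm))
    for role, text in results.items():
        print(role, "->", text)
```

Because the model is passed in as a plain callable, swapping `stub_llm` for a wrapper around a different LLM changes nothing else in the pipeline, which is one simple way to compare models under the same agent roles.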