Large Language Models for Analyzing Enterprise Architecture Debt in Unstructured Documentation

📅 2026-03-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge of automatically identifying early indicators of enterprise architecture (EA) debt—known as EA smells—from unstructured textual documents, a task that existing approaches struggle with due to their reliance on manual analysis or applicability only to structured data. Grounded in the design science research paradigm, this work pioneers the application of large language models (LLMs) to detect and quantify EA smells in unstructured enterprise documentation. By fine-tuning both locally deployed LLMs and customized GPT-based models, the proposed system efficiently processes diverse text types such as process descriptions and strategic documents while preserving data privacy. Experimental results demonstrate that the GPT-based model achieves superior accuracy and speed, whereas the local model offers enhanced security; both effectively identify multiple predefined EA smells in synthetic yet realistic business documents, thereby advancing the automation of enterprise architecture governance.
📝 Abstract
Enterprise Architecture Debt (EA Debt) arises from suboptimal design decisions and misaligned components that can degrade an organization's IT landscape over time. Early indicators, Enterprise Architecture Smells (EA Smells), are currently mainly detected manually or only from structured artifacts, leaving much unstructured documentation under-analyzed. This study proposes an approach using a large language model (LLM) to identify and quantify EA Debt in unstructured architectural documentation. Following a design science research approach, we design and evaluate an LLM-based prototype for automated EA Smell detection. The artifact ingests unstructured documents (e.g., process descriptions, strategy papers), applies fine-tuned detection models, and outputs identified smells. We evaluate the prototype through a case study using synthetic yet realistic business documents, benchmarking against a custom GPT-based model. Results show that LLMs can detect multiple predefined EA Smells in unstructured text, with the benchmark model achieving higher precision and processing speed, and the fine-tuned on-premise model offering data protection advantages. The findings highlight opportunities for integrating LLM-based smell detection into EA governance practice.
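The pipeline the abstract describes (ingest unstructured documents, apply detection models, output identified smells) can be sketched as follows. Note that the paper's detector is a fine-tuned LLM; the keyword matcher below is only a toy stand-in so the flow is runnable, and the smell names, trigger phrases, and function names are illustrative assumptions, not taken from the paper.

```python
# Toy sketch of an EA-smell detection pipeline: split a document into
# sentences, run each through a detector, collect findings.
# The regex catalog stands in for the paper's fine-tuned LLM detector.
import re
from dataclasses import dataclass

# Illustrative smell catalog: names echo the EA-smells literature,
# but the trigger phrases are invented for this example.
SMELL_PATTERNS = {
    "Duplication": r"\b(duplicate[sd]?|redundant)\b",
    "Dead Component": r"\b(unused|deprecated|no longer used)\b",
    "Ambiguous Viewpoint": r"\b(unclear ownership|no owner)\b",
}

@dataclass
class Finding:
    smell: str
    evidence: str  # the sentence that triggered the detection

def detect_ea_smells(document: str) -> list[Finding]:
    """Stand-in for the fine-tuned LLM detector described in the paper."""
    findings = []
    for sentence in re.split(r"(?<=[.!?])\s+", document):
        for smell, pattern in SMELL_PATTERNS.items():
            if re.search(pattern, sentence, re.IGNORECASE):
                findings.append(Finding(smell, sentence.strip()))
    return findings

doc = ("The billing service duplicates customer records held by CRM. "
       "The legacy reporting module is no longer used.")
for f in detect_ea_smells(doc):
    print(f.smell, "->", f.evidence)
```

In the paper's artifact, the per-sentence detector would instead be a prompt to a locally deployed or GPT-based model, which is what allows it to handle process descriptions and strategy papers rather than fixed phrase lists.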
Problem

Research questions and friction points this paper is trying to address.

Enterprise Architecture Debt
Enterprise Architecture Smells
Unstructured Documentation
Large Language Models
Architecture Analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models
Enterprise Architecture Debt
Unstructured Documentation
EA Smells
Design Science Research
Authors

Christin Pagels, Stockholm University
Simon Hacks, PRECIS group, Stockholm University (Enterprise Architecture Management, Threat Modelling, Attack Simulations)
Rob Henk Bemthuis, University of Twente