AI-Generated Smells: An Analysis of Code and Architecture in LLM and Agent-Driven Development

📅 2026-05-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

AI-generated code often satisfies functional correctness but frequently neglects long-term maintainability, which leads to accumulating technical debt and architectural degradation. This work audits technical debt in AI-generated software through a multi-scale analysis that spans single-file algorithmic tasks and complex systems generated by agents. The study finds that neither functional correctness nor detailed prompting mitigates architectural decay. It also identifies a Reasoning-Complexity Trade-off, in which more capable models generate increasingly bloated and coupled code, and a Volume-Quality Inverse Law, in which code volume is a near-perfect predictor of structural degradation. By reframing the core challenge of AI-driven software engineering from code generation to architectural complexity management, the research argues that AI agents need explicit architectural foresight if the software they build is to remain maintainable.

📝 Abstract

The promise of Large Language Models in automated software engineering is often measured by functional correctness, overlooking the critical issue of long-term maintainability. This paper presents a systematic audit of technical debt in AI-generated software, revealing that AI does not eliminate flaws but rather introduces a distinct machine signature of defects. Our multi-scale analysis, spanning single-file algorithmic tasks and complex, agent-generated systems, identifies a fundamental Reasoning-Complexity Trade-off: as models become more capable, they generate increasingly bloated and coupled code. This architectural decay is so pronounced that we establish a Volume-Quality Inverse Law, where code volume is a near-perfect predictor of structural degradation. Crucially, we demonstrate that neither functional correctness nor detailed prompting mitigates this decay. These findings challenge the current paradigm of prompt-driven generation, reframing the central problem of AI-based software engineering from one of code generation to one of architectural complexity management. We conclude that future progress depends on equipping agents with explicit architectural foresight to ensure the software they build is not just functional, but also maintainable.
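
To make the idea of code volume as a "predictor" of structural degradation concrete, the sketch below computes a Pearson correlation between lines of code and a smell count. This is an illustration only, not the paper's method: the abstract does not say which statistic or which degradation metric the authors use, and the names `pearson_r`, `lines_of_code`, and `smell_counts` as well as all numeric values are hypothetical placeholders.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder values, NOT data from the paper:
# one entry per generated project.
lines_of_code = [120, 450, 900, 1800, 3600]
smell_counts = [2, 7, 15, 29, 61]  # assumed proxy for structural degradation

r = pearson_r(lines_of_code, smell_counts)
print(f"Pearson r between volume and smell count: {r:.3f}")
```

A coefficient close to 1 on real measurements would be one way to read "near-perfect predictor"; whether the paper relies on correlation, regression fit, or another measure is not stated in the abstract.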

Problem

Research questions and friction points this paper is trying to address.

technical debt

AI-generated code

architectural complexity

maintainability

code quality

Innovation

Methods, ideas, or system contributions that make the work stand out.

technical debt

Reasoning-Complexity Trade-off

Volume-Quality Inverse Law