Can AI Agents Generate Microservices? How Far are We?

📅 2026-03-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study systematically evaluates LLM-based AI agents on generating microservices with explicit dependencies and API contracts. It assesses 144 generated microservices across three agents, four projects, two prompting strategies, and two generation scenarios: incremental generation within an existing system, evaluated with unit tests, and clean-state generation from requirements alone, evaluated with integration tests. The analysis covers functional correctness, code quality, and efficiency. In incremental generation, minimal prompts outperformed detailed ones, with unit test pass rates of 50–76%; clean-state generation reached integration test pass rates of 81–98%, indicating strong API contract adherence. Generated code showed lower complexity than human-written baselines, and generation took 6–16 minutes per service on average. Correctness remains inconsistent and still depends on human oversight, so fully autonomous microservice generation is not yet achievable.

📝 Abstract

LLMs have advanced code generation, but their use for generating microservices with explicit dependencies and API contracts remains understudied. We examine whether AI agents can generate functional microservices and how different forms of contextual information influence their performance. We assess 144 generated microservices across 3 agents, 4 projects, 2 prompting strategies, and 2 scenarios. Incremental generation operates within existing systems and is evaluated with unit tests. Clean state generation starts from requirements alone and is evaluated with integration tests. We analyze functional correctness, code quality, and efficiency. Minimal prompts outperformed detailed ones in incremental generation, with 50-76% unit test pass rates. Clean state generation produced higher integration test pass rates (81-98%), indicating strong API contract adherence. Generated code showed lower complexity than human baselines. Generation times varied widely across agents, averaging 6-16 minutes per service. AI agents can produce microservices with maintainable code, yet inconsistent correctness and reliance on human oversight show that fully autonomous microservice generation is not yet achievable.
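
The counts in the abstract can be cross-checked with a little arithmetic. The Python sketch below enumerates the experimental grid under the assumption that the design is fully crossed (every agent on every project, prompt style, and scenario). The level names are placeholders, since the abstract gives only counts. The grid has 48 cells, so 144 generated microservices works out to 3 per cell; the abstract does not say whether those are repeated runs or several services per project, so the sketch leaves that open.

```python
from itertools import product

# Placeholder level names: the abstract states only how many levels each factor has.
agents = ["agent_1", "agent_2", "agent_3"]
projects = ["project_1", "project_2", "project_3", "project_4"]
prompts = ["minimal", "detailed"]
scenarios = ["incremental", "clean_state"]

# Assumption: a fully crossed design, one cell per combination of factor levels.
configurations = list(product(agents, projects, prompts, scenarios))

TOTAL_SERVICES = 144  # figure reported in the abstract

# How many generated services fall into each cell if they are spread evenly.
per_config, remainder = divmod(TOTAL_SERVICES, len(configurations))

print(f"{len(configurations)} configurations, {per_config} services each, remainder {remainder}")
```

Because the remainder is zero, the reported total is consistent with an evenly populated grid, which is a useful sanity check when reading the per-scenario pass rates.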

Problem

Research questions and friction points this paper is trying to address.

AI Agents

Microservices

Code Generation

API Contracts

LLMs

Innovation

Methods, ideas, or system contributions that make the work stand out.

AI Agents

Microservice Generation

LLM Prompting Strategies