🤖 AI Summary
Large language models (LLMs) often generate syntactically invalid, policy-violating, or non-scalable Infrastructure-as-Code (IaC) configurations when generating in a single pass, especially for cloud-native Terraform deployments. Method: We propose MACOG, a multi-agent collaborative framework for reliable, policy-compliant Terraform code generation. MACOG employs a modular agent architecture coordinated via a shared blackboard and a finite-state machine, integrating deployment feedback, constrained decoding, retrieval-augmented generation (RAG), and Open Policy Agent (OPA)-based policy validation into a closed-loop optimization pipeline. Contribution/Results: On the IaC-Eval benchmark, MACOG significantly outperforms single-agent baselines (74.02 points with GPT-5; 60.13 points with Gemini-2.5 Pro). Ablation studies confirm the critical roles of multi-agent collaboration, constrained decoding, and policy-driven feedback in achieving robust, compliant IaC generation.
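The blackboard-plus-finite-state-machine coordination described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the agent roles, state names, and stub behaviors (e.g., a reviewer that always approves) are assumptions standing in for LLM calls and real `terraform plan`/OPA checks.

```python
from enum import Enum, auto

# Illustrative sketch of blackboard + FSM orchestration (NOT the paper's
# actual code). Each agent reads/writes the shared board and returns the
# next state; a validation failure would route feedback to the Engineer.

class State(Enum):
    ARCHITECT = auto()
    ENGINEER = auto()
    REVIEW = auto()
    VALIDATE = auto()
    DONE = auto()

def architect(board):
    board["plan"] = "vpc + ec2 web tier"  # high-level design (stub)
    return State.ENGINEER

def engineer(board):
    # Draft Terraform; a real agent would condition on board["feedback"].
    board["hcl"] = 'resource "aws_instance" "web" {}'
    return State.REVIEW

def reviewer(board):
    board["review_ok"] = True  # stub: always approve
    return State.VALIDATE

def validator(board):
    # Stand-in for `terraform plan` + OPA policy checks.
    if board.get("review_ok"):
        return State.DONE
    board["feedback"] = "policy violation"  # close the loop on failure
    return State.ENGINEER

HANDLERS = {
    State.ARCHITECT: architect,
    State.ENGINEER: engineer,
    State.REVIEW: reviewer,
    State.VALIDATE: validator,
}

def run(max_steps=10):
    board, state = {}, State.ARCHITECT
    for _ in range(max_steps):  # bounded loop: FSM guarantees termination
        if state is State.DONE:
            break
        state = HANDLERS[state](board)
    return board
```

The bounded step budget mirrors why an FSM orchestrator helps: the closed feedback loop cannot cycle forever, and every hand-off between agents is an explicit, inspectable state transition.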
📝 Abstract
The increasing complexity of cloud-native infrastructure has made Infrastructure-as-Code (IaC) essential for reproducible and scalable deployments. While large language models (LLMs) have shown promise in generating IaC snippets from natural language prompts, their monolithic, single-pass generation approach often results in syntactic errors, policy violations, and unscalable designs. In this paper, we propose MACOG (Multi-Agent Code-Orchestrated Generation), a novel multi-agent LLM-based architecture for IaC generation that decomposes the task into modular subtasks handled by specialized agents: Architect, Provider Harmonizer, Engineer, Reviewer, Security Prover, Cost and Capacity Planner, DevOps, and Memory Curator. The agents interact via a shared blackboard and a finite-state orchestrator layer, and collectively produce Terraform configurations that are not only syntactically valid but also policy-compliant and semantically coherent. To ensure infrastructure correctness and governance, we incorporate Terraform plan for execution validation and Open Policy Agent (OPA) for customizable policy enforcement. We evaluate MACOG on the IaC-Eval benchmark, where it is the strongest enhancement across models: GPT-5 improves from 54.90 (RAG) to 74.02 and Gemini-2.5 Pro from 43.56 to 60.13, with concurrent gains on BLEU, CodeBERTScore, and an LLM-judge metric. Ablations show that constrained decoding and deployment feedback are critical: removing them drops IaC-Eval to 64.89 and 56.93, respectively.
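The correctness-and-governance gate described above (Terraform plan for execution validation plus OPA for policy enforcement) might be wired up roughly as below. This is a hedged sketch: `terraform validate`, `terraform plan -detailed-exitcode`, and `opa eval --fail-defined` are real CLI invocations, but the policy file name and the `data.terraform.deny` rule path are illustrative assumptions, not part of the paper.

```python
import subprocess

def classify_plan_exit(code: int) -> str:
    # `terraform plan -detailed-exitcode`: 0 = no changes, 2 = changes
    # pending, anything else = error.
    return {0: "no_changes", 2: "changes"}.get(code, "error")

def validate_config(workdir: str, policy: str = "policy.rego") -> list:
    """Return failure messages for the feedback loop; empty = passed.

    The policy filename and `data.terraform.deny` rule path are
    hypothetical examples, not MACOG's actual configuration.
    """
    failures = []

    # 1. Syntax/semantic check of the generated HCL.
    chk = subprocess.run(["terraform", "validate"], cwd=workdir,
                         capture_output=True, text=True)
    if chk.returncode != 0:
        failures.append("syntax: " + chk.stderr.strip())

    # 2. Execution validation: does the config plan cleanly?
    plan = subprocess.run(["terraform", "plan", "-detailed-exitcode"],
                          cwd=workdir, capture_output=True, text=True)
    if classify_plan_exit(plan.returncode) == "error":
        failures.append("plan: " + plan.stderr.strip())

    # 3. Policy enforcement: non-zero exit if any deny rule is defined.
    opa = subprocess.run(["opa", "eval", "-d", policy, "--fail-defined",
                          "data.terraform.deny"],
                         capture_output=True, text=True)
    if opa.returncode != 0:
        failures.append("policy: deny rule fired")

    return failures
```

Returning structured failure messages (rather than a boolean) is what makes the closed loop useful: the messages can be written back to the blackboard so the generating agent can repair the exact violation.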