ARC: Compiling Hundreds of Requirement Scenarios into A Runnable Web System

📅 2026-02-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a scaling problem: large language models often produce incorrect implementations or omit constraints when given multi-modal documents containing hundreds of requirement scenarios, which hinders the construction of runnable systems. The authors propose Agentic Requirement Compilation (ARC), a "requirement compilation" technique built on a bidirectional, test-driven agentic loop: a top-down architecture phase decomposes requirements into verifiable interfaces, and a bottom-up implementation phase generates code that satisfies the resulting tests. ARC compiles multi-modal domain-specific language (DSL) documents into runnable web systems, producing modular designs for the UI, API, and database layers, test suites (unit, modular, and integration), and traceability across requirements, design, and code. Across six web systems generated from documents spanning 50 to 200 scenarios, ARC-generated systems pass 50.6% more GUI tests on average than state-of-the-art baselines. In a user study with 21 participants, novice users wrote DSL documents for complex systems, such as a 10K-line ticket-booking system, in an average of 5.6 hours.

📝 Abstract

Large Language Models (LLMs) have improved programming efficiency, but their performance degrades significantly as requirements scale; when faced with multi-modal documents containing hundreds of scenarios, LLMs often produce incorrect implementations or omit constraints. We propose Agentic Requirement Compilation (ARC), a technique that moves beyond simple code generation to requirement compilation, enabling the creation of runnable web systems directly from multi-modal DSL documents. ARC generates not only source code but also modular designs for UI, API, and database layers, enriched test suites (unit, modular, and integration), and detailed traceability for software maintenance. Our approach employs a bidirectional test-driven agentic loop: a top-down architecture phase decomposes requirements into verifiable interfaces, followed by a bottom-up implementation phase where agents generate code to satisfy those tests. ARC maintains strict traceability across requirements, design, and code to facilitate intelligent asset reuse. We evaluated ARC by generating six runnable web systems from documents spanning 50-200 multi-modal scenarios. Compared to state-of-the-art baselines, ARC-generated systems pass 50.6% more GUI tests on average. A user study with 21 participants showed that novice users can successfully write DSL documents for complex systems, such as a 10K-line ticket-booking system, in an average of 5.6 hours. These results demonstrate that ARC effectively transforms non-trivial requirement specifications into maintainable, runnable software.
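The bidirectional loop described in the abstract can be illustrated with a toy sketch. This is not ARC's implementation: every name here (`Scenario`, `InterfaceSpec`, `decompose`, `implement`, `compile_requirements`) is hypothetical, the agents are replaced by hard-coded stubs, and a single seat-counting function stands in for a full web system. It only shows the shape of the idea: tests are derived top-down before any code exists, candidate implementations are accepted bottom-up only when they pass, and each scenario is linked to the interface and attempt that satisfied it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Impl = Callable[[int, int], int]   # hypothetical implementation signature
Test = Callable[[Impl], bool]      # a test takes an implementation and returns pass/fail


@dataclass
class Scenario:
    """One requirement scenario (assumed shape, not the paper's DSL)."""
    sid: str
    text: str


@dataclass
class InterfaceSpec:
    """A verifiable interface: a name plus the tests an implementation must pass."""
    name: str
    scenario_ids: List[str]
    tests: List[Test]


def decompose(scenarios: List[Scenario]) -> List[InterfaceSpec]:
    # Top-down phase (stubbed): a real system would have an agent derive
    # interfaces and tests from the scenarios instead of hard-coding them.
    return [
        InterfaceSpec(
            name="remaining_seats",
            scenario_ids=[s.sid for s in scenarios],
            tests=[lambda f: f(10, 3) == 7, lambda f: f(5, 5) == 0],
        )
    ]


def implement(spec: InterfaceSpec, candidates: List[Impl]) -> Optional[int]:
    # Bottom-up phase (stubbed): each candidate stands in for one agent attempt.
    # Return the index of the first attempt that passes every test, else None.
    for attempt, impl in enumerate(candidates):
        if all(test(impl) for test in spec.tests):
            return attempt
    return None


def compile_requirements(
    scenarios: List[Scenario], candidates: List[Impl]
) -> Dict[str, Dict[str, object]]:
    # Record a scenario -> interface -> accepted attempt trace.
    trace: Dict[str, Dict[str, object]] = {}
    for spec in decompose(scenarios):
        accepted = implement(spec, candidates)
        for sid in spec.scenario_ids:
            trace[sid] = {"interface": spec.name, "accepted_attempt": accepted}
    return trace


scenarios = [Scenario("S1", "Show the seats left after each booking")]
candidates = [
    lambda capacity, booked: capacity,           # first attempt ignores bookings
    lambda capacity, booked: capacity - booked,  # second attempt passes both tests
]
trace = compile_requirements(scenarios, candidates)
print(trace)
```

In this sketch the first candidate fails the derived tests and the second is accepted, so the trace for `S1` points at attempt index 1. Keeping the trace as a separate mapping is one simple way to make requirement-to-code links queryable; the paper's actual traceability format is not specified in this summary.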

Problem

Research questions and friction points this paper is trying to address.

Large Language Models

Requirement Scenarios

Multi-modal Documents

Runnable Web System

Software Compilation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Agentic Requirement Compilation

Requirement-to-Code Compilation

Test-Driven Agent Loop