FullStack-Agent: Enhancing Agentic Full-Stack Web Coding via Development-Oriented Testing and Repository Back-Translation

📅 2026-02-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of existing code agents: they are largely confined to front-end generation and struggle with the data flow orchestration, dependency management, and complex debugging required for production-grade full-stack applications. The authors propose FullStack-Agent, a system built around FullStack-Dev, a multi-agent framework that supports end-to-end full-stack development through coordinated planning, code editing, codebase navigation, and bug localization. They further introduce FullStack-Learn, a self-improvement method based on repository back-translation, and FullStack-Bench, a comprehensive benchmark spanning front-end, back-end, and database test cases. In the reported experiments, FullStack-Dev outperforms the previous state-of-the-art method by 8.7%, 38.2%, and 15.9% on front-end, back-end, and database test cases, respectively, while FullStack-Learn improves a 30B-parameter model by 9.7%, 9.5%, and 2.8% on the same three sets.

📝 Abstract

Assisting non-expert users in developing complex interactive websites has become a popular task for LLM-powered code agents. However, existing code agents tend to generate only frontend web pages, masking the lack of real full-stack data processing and storage with fancy visual effects. Notably, constructing production-level full-stack web applications is far more challenging than generating frontend web pages alone, demanding careful control of data flow, a comprehensive understanding of constantly updating packages and dependencies, and accurate localization of obscure bugs in the codebase. To address these difficulties, we introduce FullStack-Agent, a unified agent system for full-stack agentic coding that consists of three parts: (1) FullStack-Dev, a multi-agent framework with strong planning, code editing, codebase navigation, and bug localization abilities; (2) FullStack-Learn, an innovative data-scaling and self-improving method that back-translates crawled and synthesized website repositories to improve the backbone LLM of FullStack-Dev; and (3) FullStack-Bench, a comprehensive benchmark that systematically tests the frontend, backend, and database functionalities of the generated website. Our FullStack-Dev outperforms the previous state-of-the-art method by 8.7%, 38.2%, and 15.9% on the frontend, backend, and database test cases, respectively. Additionally, FullStack-Learn raises the performance of a 30B model by 9.7%, 9.5%, and 2.8% on the three sets of test cases through self-improvement, demonstrating the effectiveness of our approach. The code is released at https://github.com/mnluzimu/FullStack-Agent.
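
The abstract does not describe how the benchmark's test cases are implemented. As a rough illustration of why a database-level check can expose a frontend-only site, the sketch below compares a handler that really persists data with one that only reports success. Every name here (`create_app_db`, `add_todo`, `fake_add_todo`, `check_database_persistence`, the `todos` table) is hypothetical and is not taken from the paper or its repository.

```python
import sqlite3


def create_app_db():
    """Create an in-memory SQLite database with a hypothetical `todos` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE todos (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
    return conn


def add_todo(conn, title):
    """Hypothetical backend handler: persist a todo and return its new id."""
    cur = conn.execute("INSERT INTO todos (title) VALUES (?)", (title,))
    conn.commit()
    return cur.lastrowid


def fake_add_todo(conn, title):
    """Frontend-only stand-in: reports success without storing anything."""
    return 1


def check_database_persistence(conn, handler):
    """Return True only if calling the handler leaves a matching row in storage."""
    new_id = handler(conn, "write tests")
    row = conn.execute(
        "SELECT title FROM todos WHERE id = ?", (new_id,)
    ).fetchone()
    return row is not None and row[0] == "write tests"


if __name__ == "__main__":
    # Each check gets a fresh database so earlier writes cannot mask a fake handler.
    print(check_database_persistence(create_app_db(), add_todo))
    print(check_database_persistence(create_app_db(), fake_add_todo))
```

The design point of the sketch is that the check inspects storage directly instead of trusting the handler's return value, which is the kind of gap a visually convincing but frontend-only site would leave open.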

Problem

Research questions and friction points this paper is trying to address.

full-stack web development

code generation

LLM-powered agents

production-level applications

bug localization

Innovation

Methods, ideas, or system contributions that make the work stand out.

FullStack-Agent

development-oriented testing

repository back-translation