SchemaAgent: A Multi-Agents Framework for Generating Relational Database Schema

📅 2025-03-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
Automatically translating natural language requirements into relational database schemas remains challenging because existing approaches rely on domain expertise, have low accuracy, and generalize poorly. Method: This paper introduces SchemaAgent, the first large language model (LLM)-based multi-agent framework for schema generation, featuring a novel “reflection–quality assurance” dual-role collaboration mechanism. It integrates specialized role division, cross-stage error detection, and structured correction techniques. Contribution/Results: Evaluated on the newly constructed RSchema benchmark (500+ high-quality requirement-schema pairs), the method significantly outperforms state-of-the-art LLMs and conventional methods: schema accuracy and completeness improve by 28.6% and 34.1%, respectively. SchemaAgent achieves, for the first time, end-to-end, high-fidelity, and interpretable relational schema generation without manual intervention.

📝 Abstract
Relational database design produces a schema from a user's requirements; the schema defines the table structures and the relationships among them. Translating requirements into an accurate schema involves several non-trivial subtasks that demand both database expertise and domain-specific knowledge. This poses unique challenges for the automated design of relational databases. Existing efforts are mostly based on customized rules or conventional deep learning models and often produce suboptimal schemas. Recently, large language models (LLMs) have significantly advanced intelligent application development across various domains. In this paper, we propose SchemaAgent, a unified LLM-based multi-agent framework for the automated generation of high-quality database schemas. SchemaAgent is the first to apply LLMs to schema generation; it emulates the workflow of manual schema design by assigning specialized roles to agents and enabling effective collaboration to refine their respective subtasks. Because schema generation is a streamlined workflow, directly applying a multi-agent framework may let errors compound from one phase to the next. To address this, we incorporate dedicated roles for reflection and inspection, alongside an innovative error detection and correction mechanism that identifies and rectifies issues across phases. For evaluation, we present a benchmark named RSchema, which contains more than 500 pairs of requirement descriptions and schemas. Experimental results on this benchmark demonstrate the superiority of our approach over mainstream LLMs for relational database schema generation.
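To make "table structures and their relationships" concrete, here is a minimal sketch of one requirement-schema pair, loaded into an in-memory SQLite database. The requirement text, table names, and columns are invented for illustration and are not taken from the RSchema benchmark; the helper names `load_schema`, `list_tables`, and `list_foreign_keys` are likewise hypothetical.

```python
import sqlite3

# Hypothetical requirement, invented for this example (not from RSchema).
REQUIREMENT = "A library tracks members and the books each member borrows."

# One plausible schema for that requirement: three tables, two relationships.
SCHEMA = """
CREATE TABLE member (
    member_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE book (
    book_id INTEGER PRIMARY KEY,
    title   TEXT NOT NULL
);
CREATE TABLE loan (
    loan_id     INTEGER PRIMARY KEY,
    member_id   INTEGER NOT NULL REFERENCES member(member_id),
    book_id     INTEGER NOT NULL REFERENCES book(book_id),
    borrowed_on TEXT NOT NULL
);
"""


def load_schema(ddl):
    """Create an in-memory database from DDL text."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
    conn.executescript(ddl)
    return conn


def list_tables(conn):
    """Return the table names defined in the database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


def list_foreign_keys(conn, table):
    """Return sorted (local column, referenced table, referenced column) triples."""
    # PRAGMA foreign_key_list rows are: id, seq, table, from, to, ...
    rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    return sorted((row[3], row[2], row[4]) for row in rows)


conn = load_schema(SCHEMA)
print(list_tables(conn))
print(list_foreign_keys(conn, "loan"))
```

The `loan` table is where the relationships live: each row links one member to one book through its two foreign keys.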
Problem

Research questions and friction points this paper is trying to address.

Automating relational database schema design from user requirements
Overcoming limitations of rule-based and conventional deep learning methods
Ensuring accuracy in multi-agent collaboration for schema generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based multi-agent framework for schema generation
Specialized roles and collaboration for subtask refinement
Error detection and correction mechanism across phases
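The detect-then-correct idea in the last item can be sketched as a small loop. This is a sketch under stated assumptions, not SchemaAgent's implementation: the `Table` structure, `inspect_schema`, `correct_schema`, and `refine` are hypothetical names, and the rule-based checks and fixes stand in for what would be LLM-driven inspector and corrector agents in the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Table:
    name: str
    columns: List[str]
    primary_key: Optional[str] = None
    # Maps a local column to the name of the table it references.
    foreign_keys: Dict[str, str] = field(default_factory=dict)


def inspect_schema(tables: List[Table]) -> List[str]:
    """Hypothetical inspector role: report structural issues in a draft."""
    known = {t.name for t in tables}
    issues = []
    for t in tables:
        if t.primary_key not in t.columns:
            issues.append(f"{t.name}: missing or invalid primary key")
        for column, target in t.foreign_keys.items():
            if column not in t.columns:
                issues.append(f"{t.name}.{column}: foreign key column is not defined")
            if target not in known:
                issues.append(f"{t.name}.{column}: references unknown table {target}")
    return issues


def correct_schema(tables: List[Table]) -> List[Table]:
    """Hypothetical corrector role: naive rule-based fixes (an LLM in practice)."""
    known = {t.name for t in tables}
    fixed = []
    for t in tables:
        columns = list(t.columns)
        pk = t.primary_key
        if pk not in columns:
            pk = "id"  # assumption: fall back to a surrogate key
            if pk not in columns:
                columns.insert(0, pk)
        # Drop foreign keys that cannot be resolved.
        fks = {c: r for c, r in t.foreign_keys.items() if c in columns and r in known}
        fixed.append(Table(t.name, columns, pk, fks))
    return fixed


def refine(tables: List[Table], max_rounds: int = 3) -> Tuple[List[Table], List[str]]:
    """Alternate inspection and correction until clean or out of rounds."""
    for _ in range(max_rounds):
        if not inspect_schema(tables):
            break
        tables = correct_schema(tables)
    return tables, inspect_schema(tables)


# A deliberately flawed draft: no key on customer, one dangling foreign key.
draft = [
    Table("customer", ["name"]),
    Table("orders", ["order_id", "customer_id"], "order_id",
          {"customer_id": "customer", "item_id": "product"}),
]
final, remaining = refine(draft)
print(inspect_schema(draft))
print(remaining)
```

Bounding the loop with `max_rounds` is a design choice for the sketch: it keeps a corrector that cannot fix an issue from cycling forever, and the caller still sees whatever issues remain.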
Qin Wang
ETH Zurich
Domain Adaptation, Computer Vision
Youhuan Li
Hunan University
AI/LLM for DB
Yansong Feng
Peking University
Natural Language Processing, Pattern Recognition
Si Chen
College of Computer Science and Electronic Engineering, Hunan University
Ziming Li
College of Computer Science and Electronic Engineering, Hunan University
Pan Zhang
College of Computer Science and Electronic Engineering, Hunan University
Zhichao Shi
School of Advanced Interdisciplinary; Institute of Computing Technology, Chinese Academy of Sciences
Yuequn Dou
College of Computer Science and Electronic Engineering, Hunan University
Chuchu Gao
College of Computer Science and Electronic Engineering, Hunan University
Zebin Huang
College of Computer Science and Electronic Engineering, Hunan University
Zihui Si
College of Computer Science and Electronic Engineering, Hunan University
Yixuan Chen
Oxford Suzhou Center for Advanced Research
Disentanglement, Vision-Language Model, AI for Medical
Zhaohai Sun
College of Computer Science and Electronic Engineering, Hunan University
Ke Tang
College of Computer Science and Electronic Engineering, Hunan University
Wenqiang Jin
College of Computer Science and Electronic Engineering, Hunan University