OpenAI for OpenAPI: Automated generation of REST API specification via LLMs

📅 2026-01-19
🤖 AI Summary
This work addresses the inefficiency, error-proneness, and technology-stack dependency developers face when authoring and maintaining OpenAPI specifications (OAS). To overcome these challenges, we propose OOPS—the first language- and framework-agnostic, LLM-driven static analysis approach. OOPS constructs an API dependency graph to link cross-file code artifacts and employs a two-stage pipeline—endpoint extraction followed by OAS generation—augmented with a self-refinement mechanism to mitigate LLM context limitations and reduce hallucinations. Evaluated on 12 real-world REST APIs spanning five programming languages and eight frameworks, OOPS achieves F1 scores of 98%, 97%, and 92% in endpoint identification, request/response structure inference, and parameter constraint derivation, respectively, with average input and output token counts below 5.6K and 0.9K, substantially decreasing reliance on manual intervention and handcrafted rules.
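The F1 scores quoted above combine precision and recall. As a minimal sketch of how such a score could be computed for endpoint identification, the Python snippet below compares a predicted set of endpoints with a ground-truth set. It assumes an endpoint counts as correct only when its HTTP method and path match exactly; that matching rule, the `f1_score` helper, and the sample endpoints are illustrative assumptions, not details taken from the paper.

```python
def f1_score(predicted, expected):
    """Return the F1 score of a predicted set against a ground-truth set."""
    predicted, expected = set(predicted), set(expected)
    true_positives = len(predicted & expected)
    if true_positives == 0:
        # Covers empty inputs and the no-overlap case without dividing by zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(expected)
    return 2 * precision * recall / (precision + recall)


# Hypothetical ground truth and tool output, as (method, path) pairs.
expected = {
    ("GET", "/users"),
    ("POST", "/users"),
    ("GET", "/users/{id}"),
    ("DELETE", "/users/{id}"),
}
predicted = {
    ("GET", "/users"),
    ("POST", "/users"),
    ("GET", "/users/{id}"),
    ("PUT", "/users/{id}"),  # wrong method: one false positive, one miss
}

score = f1_score(predicted, expected)
```

With three of four endpoints matched on each side, precision and recall are both 0.75, so the F1 score is also 0.75.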

📝 Abstract
REST APIs, based on the REpresentational State Transfer (REST) architecture, are the primary type of Web API. The OpenAPI Specification (OAS) serves as the de facto standard for describing REST APIs and is crucial for multiple software engineering tasks. However, developers face challenges in writing and maintaining OAS. Although static analysis shows potential for OAS generation, it is limited to specific programming languages and development frameworks. The powerful code understanding capabilities of LLMs offer new opportunities for OAS generation, yet they are constrained by context limitations and hallucinations. To address these challenges, we propose the OpenAI OpenAPI Project Scanner (OOPS), the first technology-agnostic LLM-based static analysis method for OAS generation, requiring fewer technology-specific rules and less human expert intervention. OOPS is implemented as an LLM agent workflow comprising two key steps: endpoint method extraction and OAS generation. By constructing an API dependency graph, it establishes the necessary file associations to address LLMs' context limitations. Through multi-stage generation and self-refinement, it mitigates both syntactic and semantic hallucinations during OAS generation. We evaluated OOPS on 12 real-world REST APIs spanning 5 programming languages and 8 development frameworks. Experimental results demonstrate that OOPS accurately generates high-quality OAS for REST APIs implemented with diverse technologies, achieving an average F1-score exceeding 98% for endpoint method inference, 97% for both request parameter and response inference, and 92% for parameter constraint inference. The input tokens average below 5.6K with a maximum of 16.2K, while the output tokens average below 0.9K with a maximum of 7.7K.
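The role of the API dependency graph can be illustrated with a minimal Python sketch. It assumes the graph is a plain mapping from each source file to the files it references, and it walks that mapping breadth-first from an endpoint file to decide which files to place in an LLM prompt. The file names, the `collect_context` helper, and the `max_files` cap are hypothetical; the abstract does not describe how OOPS actually builds or traverses its graph.

```python
from collections import deque


def collect_context(graph, entry_file, max_files=10):
    """Breadth-first walk from an endpoint file over its dependencies.

    Returns the visited files, nearest first, capped at max_files so the
    resulting prompt stays within a context budget.
    """
    seen = [entry_file]
    queue = deque([entry_file])
    while queue and len(seen) < max_files:
        current = queue.popleft()
        for dep in graph.get(current, []):
            if dep not in seen:
                seen.append(dep)
                queue.append(dep)
                if len(seen) == max_files:
                    break
    return seen


# Hypothetical project layout: a controller references a DTO and a service.
graph = {
    "UserController.java": ["UserDto.java", "UserService.java"],
    "UserService.java": ["UserRepository.java"],
    "OrderController.java": ["OrderDto.java"],
}

context_files = collect_context(graph, "UserController.java")
```

Only files reachable from the chosen endpoint file are collected, so unrelated code such as `OrderController.java` never enters the prompt. This is one plausible way a dependency graph can keep input sizes small.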
Problem

Research questions and friction points this paper is trying to address.

REST API
OpenAPI Specification
LLM hallucination
static analysis
context limitation
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based static analysis
OpenAPI Specification generation
API dependency graph
hallucination mitigation
technology-agnostic
Hao Chen
School of Computer Science and Engineering, Beihang University, Beijing, China; Sino-German Joint Software Institute, Beihang University, Beijing, China
Yunchun Li
School of Computer Science and Engineering, Beihang University, Beijing, China; Sino-German Joint Software Institute, Beihang University, Beijing, China
Chen Chen
Hangzhou Innovation Institute, Beihang University
medical image, robot navigation, embodied intelligence, domain knowledge, deep learning
Fengxu Lin
School of Computer Science and Engineering, Beihang University, Beijing, China; Sino-German Joint Software Institute, Beihang University, Beijing, China
Wei Li
Institute of Computing Technology, Chinese Academy of Sciences
computer