🤖 AI Summary
Manually authoring RESTful API specifications (e.g., OpenAPI) is labor-intensive, error-prone, and hard to maintain. To address this, the paper proposes LRASGen, which the authors describe as, to the best of their knowledge, the first approach that uses large language models (LLMs) together with API source code to generate OpenAPI specifications automatically. Unlike prior methods, it requires neither comprehensive code annotations nor structured code representations; instead, it relies on the code-understanding and text-generation capabilities of LLMs to infer interface descriptions directly from raw source code. The method is designed to tolerate real-world development imperfections, such as incomplete implementations and missing annotations or comments. Evaluated on 20 real-world RESTful APIs with two LLMs (GPT-4o mini and DeepSeek V3), the generated specifications cover an average of 48.85% more missed entities than the developer-provided specifications.
📝 Abstract
REpresentational State Transfer (REST) is an architectural style for designing web applications that enables scalable, stateless communication between clients and servers via common HTTP techniques. Web APIs that employ the REST style are known as RESTful (or REST) APIs. When using or testing a RESTful API, developers may need to consult its specification, which is often defined by open-source standards such as the OpenAPI Specification (OAS). However, writing and updating these specifications can be very time-consuming and error-prone, which may negatively impact the use of RESTful APIs, especially when the software requirements change. Many tools and methods have been proposed to solve this problem, such as Respector and Swagger Core. OAS generation can be regarded as a common text-generation task that creates a formal description of API endpoints derived from the source code. A potential solution may involve using Large Language Models (LLMs), which have strong capabilities in both code understanding and text generation. Motivated by this, we propose a novel approach for generating the OASs of RESTful APIs using LLMs: LLM-based RESTful API-Specification Generation (LRASGen). To the best of our knowledge, this is the first use of LLMs and API source code to generate OASs for RESTful APIs. Compared with existing tools and methods, LRASGen can generate OASs even when the implementation is incomplete (with partial code, and/or missing annotations/comments, etc.). To evaluate LRASGen's performance, we conducted a series of empirical studies on 20 real-world RESTful APIs. The results show that two LLMs (GPT-4o mini and DeepSeek V3) can both support LRASGen in generating accurate specifications, and that LRASGen-generated specifications cover an average of 48.85% more missed entities than the developer-provided specifications.
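For readers unfamiliar with the target format, the following is a minimal sketch of what an OAS document for a single endpoint can look like, written as a Python dictionary. It is an illustration only and not taken from LRASGen: the handler source, the `/users/{user_id}` path, the API title, and the helper `list_operations` are all hypothetical, and the specification is written by hand rather than produced by an LLM.

```python
import json

# Hypothetical handler source (assumption, not from the paper). It stands in
# for the kind of raw code a specification would be derived from.
HANDLER_SOURCE = '''
@app.get("/users/{user_id}")
def get_user(user_id: int):
    return db.find_user(user_id)
'''

# Hand-written OpenAPI 3.0 document describing that one endpoint.
SPEC = {
    "openapi": "3.0.3",
    "info": {"title": "Example User API", "version": "1.0.0"},
    "paths": {
        "/users/{user_id}": {
            "get": {
                "parameters": [
                    {
                        "name": "user_id",
                        "in": "path",
                        "required": True,
                        "schema": {"type": "integer"},
                    }
                ],
                "responses": {"200": {"description": "The requested user"}},
            }
        }
    },
}

# HTTP methods that may appear as operation keys under a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec):
    """Return the sorted (METHOD, path) pairs described by an OAS document."""
    operations = []
    for path, path_item in spec.get("paths", {}).items():
        for key in path_item:
            if key in HTTP_METHODS:
                operations.append((key.upper(), path))
    return sorted(operations)


if __name__ == "__main__":
    print(json.dumps(SPEC, indent=2))
    print(list_operations(SPEC))
```

Listing operations this way is one simple means of inspecting which endpoints a specification describes; how the paper defines and counts "entities" is not stated in the abstract, so this helper should not be read as its evaluation metric.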