You Can REST Now: Automated Specification Inference and Black-Box Testing of RESTful APIs with Large Language Models

📅 2024-02-07

🏛️ arXiv.org

📈 Citations: 11

✨ Influential: 0

🤖 AI Summary

REST API documentation is often incomplete, outdated, or unavailable, which hampers both automated testing and human comprehension. This paper introduces RESTSpecIT, which the authors describe as the first LLM-based approach to automated OpenAPI specification inference and black-box testing of RESTful APIs. It needs only an API name and an LLM key as input. The tool generates and mutates HTTP requests using data returned by the LLM, sends them to the API, and analyzes the responses to infer the specification and detect defects. An in-context prompt masking strategy lets it work without model fine-tuning. In the authors' evaluation, inferred specifications contain on average 85.05% of GET routes and 81.05% of query parameters; the tool also discovers undocumented but valid routes and parameters and uncovers server errors. The inferred OpenAPI specifications can be used as input to existing API testing tools.

📝 Abstract

RESTful APIs are popular web services, requiring documentation to ease their comprehension, reusability and testing practices. The OpenAPI Specification (OAS) is a widely adopted and machine-readable format used to document such APIs. However, manually documenting RESTful APIs is a time-consuming and error-prone task, resulting in unavailable, incomplete, or imprecise documentation. As RESTful API testing tools require an OpenAPI specification as input, insufficient or informal documentation hampers testing quality. Recently, Large Language Models (LLMs) have demonstrated exceptional abilities to automate tasks based on their colossal training data. Accordingly, such capabilities could be utilized to assist the documentation and testing process of RESTful APIs. In this paper, we present RESTSpecIT, the first automated RESTful API specification inference and black-box testing approach leveraging LLMs. The approach requires minimal user input compared to state-of-the-art RESTful API inference and testing tools: given an API name and an LLM key, HTTP requests are generated and mutated with data returned by the LLM. By sending the requests to the API endpoint, HTTP responses can be analyzed for inference and testing purposes. RESTSpecIT utilizes an in-context prompt masking strategy, requiring no model fine-tuning. Our evaluation demonstrates that RESTSpecIT is capable of: (1) inferring specifications with 85.05% of GET routes and 81.05% of query parameters found on average, (2) discovering undocumented and valid routes and parameters, and (3) uncovering server errors in RESTful APIs. Inferred specifications can also be used as testing tool inputs.
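
To make the inference step in the abstract concrete, here is a minimal Python sketch of how a masked prompt could be built and how successful GET requests could be folded into an OpenAPI document. It is an illustration under stated assumptions, not RESTSpecIT's actual implementation: the `<mask>` token, the prompt wording, the function names, and the `api.example.com` URLs are all hypothetical, and every query parameter is typed as a string for simplicity.

```python
import json
from urllib.parse import urlsplit, parse_qsl

MASK = "<mask>"  # hypothetical placeholder token, not taken from the paper


def build_masked_prompt(api_name: str, base_url: str) -> str:
    """Build a prompt asking an LLM to fill in a masked route (assumed wording)."""
    return (
        f"API: {api_name}\n"
        "Complete the masked part of this GET request:\n"
        f"{base_url}/{MASK}\n"
        "Answer with the full URL only."
    )


def new_spec(api_name: str, base_url: str) -> dict:
    """Create an empty OpenAPI 3.0 skeleton for the API under test."""
    return {
        "openapi": "3.0.3",
        "info": {"title": api_name, "version": "inferred"},
        "servers": [{"url": base_url}],
        "paths": {},
    }


def record_valid_request(spec: dict, url: str) -> None:
    """Add the route and query parameters of a successful GET request to the spec."""
    parts = urlsplit(url)
    operation = spec["paths"].setdefault(parts.path or "/", {}).setdefault(
        "get", {"parameters": [], "responses": {"200": {"description": "OK"}}}
    )
    known = {p["name"] for p in operation["parameters"]}
    for name, _value in parse_qsl(parts.query, keep_blank_values=True):
        if name not in known:
            # Assumption: every parameter is recorded as a string.
            operation["parameters"].append(
                {"name": name, "in": "query", "schema": {"type": "string"}}
            )
            known.add(name)


if __name__ == "__main__":
    spec = new_spec("Example API", "https://api.example.com")
    print(build_masked_prompt("Example API", "https://api.example.com"))
    # Pretend these two URLs were proposed by the LLM and answered with HTTP 200.
    record_valid_request(spec, "https://api.example.com/users?limit=10&page=2")
    record_valid_request(spec, "https://api.example.com/users?limit=5&sort=asc")
    print(json.dumps(spec["paths"], indent=2))
```

Merging parameters per route, rather than storing each request separately, keeps the inferred document close to what testing tools expect as input: one GET operation per path with the union of observed query parameters.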

Problem

Research questions and friction points this paper is trying to address.

Automate REST API documentation to reduce errors and save time

Enhance API testing efficiency with minimal user input required

Infer and validate API routes and parameters using LLMs

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-assisted automated REST API documentation

Black-box testing via request mutations

No prior model fine-tuning required
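
As a rough illustration of black-box testing via request mutations, the Python sketch below mutates one query parameter at a time on a seed request and sorts the resulting URLs by response status. It is a sketch under assumptions rather than the paper's method: the function names, the three result buckets, and the `send` callable (a stand-in for a real HTTP client) are hypothetical, and in practice the candidate name/value pairs would come from the LLM.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def mutate_query(url: str, name: str, value: str) -> str:
    """Return a copy of `url` with query parameter `name` set to `value`."""
    parts = urlsplit(url)
    params = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != name
    ]
    params.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(params)))


def classify(status: int) -> str:
    """Bucket an HTTP status code (assumed, simplified categories)."""
    if 200 <= status < 300:
        return "valid"
    if 500 <= status < 600:
        return "server_error"
    return "rejected"


def run_round(seed_url, candidates, send):
    """Apply each (name, value) mutation to the seed and group URLs by outcome.

    `send` is any callable mapping a URL to an HTTP status code.
    """
    report = {"valid": [], "server_error": [], "rejected": []}
    for name, value in candidates:
        url = mutate_query(seed_url, name, value)
        report[classify(send(url))].append(url)
    return report


if __name__ == "__main__":
    def fake_send(url: str) -> int:
        # Stand-in for a real API: crashes on a negative limit, accepts `page`.
        if "limit=-1" in url:
            return 500
        return 200 if "page=" in url else 404

    seed = "https://api.example.com/users?limit=10"
    candidates = [("limit", "-1"), ("page", "2"), ("colour", "red")]
    for bucket, urls in run_round(seed, candidates, fake_send).items():
        print(bucket, urls)
```

URLs in the `valid` bucket are candidates for the inferred specification, while those in `server_error` are the kind of finding a black-box tester would report.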
