🤖 AI Summary
To address insufficient boundary-value coverage and poor test-case readability in REST API automated testing, this paper proposes an API test-case augmentation method that leverages off-the-shelf commercial large language models (LLMs)—specifically ChatGPT and GitHub Copilot—without fine-tuning. The approach employs test-quality-oriented prompt engineering to systematically validate the LLMs' effectiveness in generating protocol-compliant boundary values, yielding 12 reusable, empirically grounded prompt design principles. Experimental evaluation demonstrates that the generated tests improve path and parameter boundary coverage while maintaining semantic clarity and human interpretability. This work provides a systematic empirical validation of zero-shot LLMs for practical API test augmentation, pointing toward low-barrier, high-quality API testing.
📝 Abstract
REST APIs are an indispensable building block in today's cloud-native applications, so testing them is critically important. However, writing automated tests for such REST APIs is challenging because one needs strong and readable tests that exercise the boundary values of the protocol embedded in the REST API. In this paper, we report our experience with using "out-of-the-box" large language models (ChatGPT and GitHub Copilot) to amplify REST API test suites. We compare the resulting tests based on coverage and understandability, and we derive a series of guidelines and lessons learned concerning the prompts that result in the strongest test suites.
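The "boundary values of the protocol" mentioned above are the kind produced by classic boundary-value analysis on an API's parameter constraints. As a minimal illustration (not the paper's method), the sketch below derives candidate test inputs from a hypothetical parameter spec shaped like an OpenAPI integer schema with `minimum` and `maximum` keys; the helper name and spec shape are assumptions for illustration only.

```python
def boundary_values(param):
    """Derive candidate boundary test values for an integer API
    parameter described by a spec dict with 'minimum' and 'maximum'
    keys (hypothetical, modeled loosely on an OpenAPI schema)."""
    lo, hi = param["minimum"], param["maximum"]
    # Classic boundary-value analysis: values on and adjacent to each
    # bound should be accepted; values just outside should be rejected.
    return {
        "valid": [lo, lo + 1, hi - 1, hi],
        "invalid": [lo - 1, hi + 1],
    }

# Example: a 'page' query parameter constrained to the range 1..100.
cases = boundary_values({"minimum": 1, "maximum": 100})
print(cases["valid"])    # [1, 2, 99, 100]
print(cases["invalid"])  # [0, 101]
```

A test amplifier, whether hand-written or LLM-driven, would then turn each of these values into a request against the endpoint and assert a 2xx response for the valid cases and a 4xx response for the invalid ones.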