AI Summary
To address the limitations of existing automation techniques in accuracy and efficiency for web end-to-end (E2E) testing, this paper proposes a novel feature-driven E2E test generation paradigm: leveraging large language models (LLMs) to automatically identify functional features of websites and generate semantically coherent, executable test cases. Our key contributions are threefold: (1) the first feature-driven test generation framework grounded in functional semantic reasoning; (2) E2EBench, the first benchmark explicitly designed for functional coverage evaluation; and (3) achieving 79% average feature coverage on E2EBench, outperforming the strongest baseline by 558%, thereby significantly enhancing both test completeness and semantic fidelity.
Abstract
End-to-end (E2E) testing is essential for ensuring web application quality. However, manual test creation is time-consuming, and current test generation techniques produce incoherent tests. In this paper, we present AutoE2E, a novel approach that leverages Large Language Models (LLMs) to automate the generation of semantically meaningful feature-driven E2E test cases for web applications. AutoE2E intelligently infers potential features within a web application and translates them into executable test scenarios. Furthermore, we address a critical gap in the research community by introducing E2EBench, a new benchmark for automatically assessing the feature coverage of E2E test suites. Our evaluation on E2EBench demonstrates that AutoE2E achieves an average feature coverage of 79%, outperforming the best baseline by 558%, highlighting its effectiveness in generating high-quality, comprehensive test cases.
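The abstract reports its headline numbers without defining them, so the following is only a minimal sketch of one plausible reading, not E2EBench's actual scoring procedure. It assumes that feature coverage is the fraction of an application's known features exercised by at least one test, and that "outperforming by 558%" is a relative improvement over the baseline score. The function names and the feature labels are hypothetical and chosen for illustration.

```python
def feature_coverage(covered: set[str], all_features: set[str]) -> float:
    """Fraction of known features exercised by at least one test (assumed definition)."""
    if not all_features:
        raise ValueError("all_features must be non-empty")
    # Ignore any "covered" labels that are not part of the known feature set.
    return len(covered & all_features) / len(all_features)


def relative_improvement(new: float, baseline: float) -> float:
    """Relative gain of `new` over `baseline` as a fraction (5.58 means 558%)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (new - baseline) / baseline


def implied_baseline(new: float, improvement: float) -> float:
    """Baseline score implied by a reported score and a relative improvement."""
    return new / (1.0 + improvement)


if __name__ == "__main__":
    # Hypothetical feature labels for a made-up shop application.
    features = {"login", "search", "add_to_cart", "checkout"}
    exercised = {"login", "search", "checkout"}
    print(f"coverage: {feature_coverage(exercised, features):.0%}")

    # Under the relative-improvement reading, the reported 79% and 558%
    # imply a baseline coverage of roughly 12%.
    print(f"implied baseline: {implied_baseline(0.79, 5.58):.1%}")
```

Read this way, the two reported figures are consistent with a best-baseline coverage of about 12%. That value is inferred from the arithmetic and is not stated in the abstract.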