🤖 AI Summary
Existing automated test case generation methods struggle to produce relevant inputs, which limits the effectiveness of the resulting tests. This paper proposes BRMiner, a hybrid framework that combines large language models (LLMs) with rule- and pattern-driven techniques such as regular-expression matching, using LLM-based filtering to extract high-relevance test inputs from bug reports. Evaluated on the Defects4J benchmark, BRMiner achieves a 60.03% Relevant Input Rate and a 31.71% Relevant Input Extraction Accuracy Rate, substantially outperforming LLM-only baselines. Feeding BRMiner's inputs into EvoSuite improves branch, instruction, method, and line coverage across multiple projects, and integrating them into Randoop enables the detection of 58 unique bugs, including bugs missed by baseline tools. This work presents a systematic investigation of how principled collaboration between LLMs and traditional extraction techniques can strengthen automated test input generation.
📝 Abstract
The quality of software is closely tied to the effectiveness of the tests it undergoes. Manual test writing, though crucial for bug detection, is time-consuming, which has driven significant research into automated test case generation. However, current methods often struggle to generate relevant inputs, limiting the effectiveness of the tests produced. To address this, we introduce BRMiner, a novel approach that leverages Large Language Models (LLMs) in combination with traditional techniques to extract relevant inputs from bug reports, thereby enhancing automated test generation tools. In this study, we evaluate BRMiner using the Defects4J benchmark and test generation tools such as EvoSuite and Randoop. Our results demonstrate that BRMiner achieves a Relevant Input Rate (RIR) of 60.03% and a Relevant Input Extraction Accuracy Rate (RIEAR) of 31.71%, significantly outperforming methods that rely on LLMs alone. Integrating BRMiner's inputs enhances EvoSuite's ability to generate more effective tests, leading to increased code coverage, with gains observed in branch, instruction, method, and line coverage across multiple projects. Furthermore, BRMiner facilitates the detection of 58 unique bugs, including bugs missed by traditional baseline approaches. Overall, BRMiner's combination of LLM filtering with traditional input extraction techniques significantly improves the relevance and effectiveness of automated test generation, advancing bug detection and enhancing code coverage, thereby contributing to higher-quality software development.