🤖 AI Summary
Existing large language model (LLM)-based unit test generation approaches often overlook the mocking information embedded in developer-written tests. This work proposes MOCKMILL, a technique that automatically extracts that information (stubbings and interaction expectations) and uses it to guide LLMs in generating test cases for the components that existing tests replace with mocks. MOCKMILL pairs generation with an iterative generate-and-repair process so that the produced tests are executable. In an evaluation on ten classes from six Java projects using four LLMs, MOCKMILL's tests covered lines of code and killed mutants that both the existing project tests and baseline-generated tests missed, which the authors present as preliminary evidence that mocking information is a useful complementary signal for LLM-based test generation.
📝 Abstract
Large Language Models (LLMs) have recently shown strong potential for automated unit test generation. This has motivated us to investigate whether developer-defined test doubles (commonly referred to as mocks) available in existing test suites can be leveraged to improve LLM-driven test generation. To this end, we propose MOCKMILL, an LLM-based technique and tool that generates test cases by exploiting mocking information automatically extracted from developer-written tests. MOCKMILL targets components that are replaced by test doubles in existing tests and uses the encoded stubbings and interaction expectations to guide test generation, combined with an iterative generation-and-repair process to ensure executable tests. We evaluated MOCKMILL on 10 open-source classes from six Java projects using four LLMs, and compared the generated tests with existing project tests and tests produced by baseline approaches. The results show that MOCKMILL's tests cover lines of code and kill mutants that existing tests and baseline-generated tests miss. Overall, our findings provide preliminary evidence that leveraging mocking information is a complementary and effective way to enhance LLM-based test generation.
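The generation-and-repair process described above can be pictured with a minimal Java sketch. It is not MOCKMILL's implementation: the abstract does not specify the prompt format, the repair budget, or any API, so every name below (`MockingInfo`, `RunResult`, `buildPrompt`, `generateAndRepair`) is a hypothetical stand-in, and the LLM and the test runner are modeled as plain functions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Illustrative sketch only: all names are hypothetical, not MOCKMILL's API.
final class GenerateAndRepairSketch {

    /** Mocking facts recovered from an existing test (assumed shape). */
    static final class MockingInfo {
        final String mockedType;
        final List<String> stubbings;      // e.g. "when(repo.find(1)).thenReturn(user)"
        final List<String> verifications;  // e.g. "verify(repo).save(user)"

        MockingInfo(String mockedType, List<String> stubbings, List<String> verifications) {
            this.mockedType = mockedType;
            this.stubbings = stubbings;
            this.verifications = verifications;
        }
    }

    /** Outcome of compiling and running one candidate test. */
    static final class RunResult {
        final boolean passed;
        final String errorLog;

        RunResult(boolean passed, String errorLog) {
            this.passed = passed;
            this.errorLog = errorLog;
        }
    }

    /** Turns the extracted stubbings and expectations into a prompt (assumed wording). */
    static String buildPrompt(MockingInfo info) {
        return "Write JUnit tests for the real class " + info.mockedType + ".\n"
                + "Existing tests stub it as:\n  " + String.join("\n  ", info.stubbings) + "\n"
                + "and expect these interactions:\n  " + String.join("\n  ", info.verifications) + "\n";
    }

    /**
     * Asks the LLM for a test, runs it, and feeds any error back into the next
     * prompt. Gives up after maxAttempts (an assumed, caller-chosen budget).
     */
    static Optional<String> generateAndRepair(MockingInfo info,
                                              Function<String, String> llm,
                                              Function<String, RunResult> runner,
                                              int maxAttempts) {
        String basePrompt = buildPrompt(info);
        String prompt = basePrompt;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String candidate = llm.apply(prompt);
            RunResult result = runner.apply(candidate);
            if (result.passed) {
                return Optional.of(candidate);
            }
            // Repair step: show the model its own test and the failure it caused.
            prompt = basePrompt
                    + "Previous attempt:\n" + candidate + "\n"
                    + "Build or run error:\n" + result.errorLog + "\n"
                    + "Return a corrected test.\n";
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        MockingInfo info = new MockingInfo(
                "UserRepository",
                Arrays.asList("when(repo.find(1)).thenReturn(user)"),
                Arrays.asList("verify(repo).save(user)"));
        // Stand-ins for a real LLM client and a real compile-and-run step.
        Function<String, String> fakeLlm =
                p -> p.contains("Previous attempt") ? "class FixedTest {}" : "class BrokenTest {";
        Function<String, RunResult> fakeRunner =
                t -> t.endsWith("}") ? new RunResult(true, "") : new RunResult(false, "'}' expected");
        System.out.println(generateAndRepair(info, fakeLlm, fakeRunner, 3).isPresent());
    }
}
```

Bounding the loop with `maxAttempts` is a design choice made for this sketch. It keeps the cost of a stubborn failure predictable, and a caller can discard candidates that never become executable.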