🤖 AI Summary
Neural test generation suffers from weak semantic validity because training data is scarce and insufficiently diverse, particularly for emerging programming languages. To address this, we propose FuzzAug, a novel data augmentation technique for neural test generation that integrates fuzzing-inspired heuristic mutation. FuzzAug targets both input diversity and program semantic fidelity by (1) generating candidate tests via fuzzing-based mutation, (2) fine-tuning large language models (LLMs) on the augmented data, (3) validating outputs via compilation, and (4) iteratively refining candidates using branch coverage feedback. This couples fuzzing-driven test generation with LLM-based synthesis. Empirical evaluation shows that models trained on FuzzAug-augmented data improve assertion accuracy by 5%, compilation rate by more than 10%, and branch coverage of generated unit test functions by 5%, making LLMs better at producing high-quality, high-coverage unit tests.
📝 Abstract
Testing is essential to modern software engineering for building reliable software. Given the high cost of manually creating test cases, automated test case generation, particularly with large language models, has become increasingly popular. These neural approaches generate semantically meaningful tests that are more maintainable than those produced by traditional automatic testing methods such as fuzzing. However, the diversity and volume of unit tests in current datasets are limited. In this paper, we introduce a novel data augmentation technique, *FuzzAug*, that brings the benefits of fuzzing to large language models: it provides diverse inputs while preserving valid program semantics. This enhances the model's ability to embed correct inputs that explore more branches of the function under test. Our evaluations show that models trained on datasets augmented by FuzzAug increase assertion accuracy by 5%, improve compilation rate by more than 10%, and generate unit test functions with 5% more branch coverage. These results demonstrate the potential of using dynamic software testing to improve neural test generation.