🤖 AI Summary
This work addresses the limited effectiveness and diversity of generated test inputs in boundary testing of deep learning image classifiers by proposing a latent-space truncation regularization method based on StyleGAN. By combining latent code mixing with a bisection-search-optimized truncation strategy, the approach efficiently generates high-quality, diverse boundary test samples in the latent space. Experiments on MNIST, Fashion-MNIST, and CIFAR-10 show that the method significantly outperforms random truncation, achieving higher fault detection rates while also improving the validity and diversity of the generated test inputs.
📝 Abstract
This study investigates the impact of regularizing latent spaces through truncation on the quality of generated test inputs for deep learning classifiers. We evaluate this effect using style-based GANs, a state-of-the-art generative approach, and assess quality along three dimensions: validity, diversity, and fault detection. We evaluate our approach on the boundary testing of deep learning image classifiers across three datasets: MNIST, Fashion-MNIST, and CIFAR-10. We compare two truncation strategies: latent code mixing with bisection-search optimization, and random latent truncation for generative exploration. Our experiments show that the latent code mixing approach yields a higher fault detection rate than random truncation, while also improving both diversity and validity.
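To make the truncation-plus-bisection idea concrete, here is a minimal sketch. It is not the paper's implementation: the 4-dimensional latent, the zero mean latent `W_AVG`, and the toy `classifier_margin` surrogate are all illustrative stand-ins for a StyleGAN mapping network and a deep image classifier. The sketch shows only the core mechanism: StyleGAN-style truncation pulls a latent code toward the mean latent by a factor psi, and a bisection search over psi locates the point where the classifier's prediction flips, i.e. a boundary test input.

```python
import numpy as np

# Hypothetical stand-ins: a 4-D latent space with mean latent at the origin.
W_AVG = np.zeros(4)

def truncate(w, psi):
    """StyleGAN-style truncation: interpolate between the mean latent and w."""
    return W_AVG + psi * (w - W_AVG)

def classifier_margin(w):
    """Toy surrogate for a classifier: signed distance to a decision boundary
    in latent space. Positive => original class; negative => prediction flipped."""
    return 1.0 - np.linalg.norm(w)

def bisect_truncation(w, lo=0.0, hi=1.0, tol=1e-6):
    """Bisection search over the truncation factor psi for the value at which
    the truncated latent crosses the decision boundary. Assumes the prediction
    is stable at psi=lo and flipped at psi=hi."""
    assert classifier_margin(truncate(w, lo)) > 0 > classifier_margin(truncate(w, hi))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if classifier_margin(truncate(w, mid)) > 0:
            lo = mid  # still the original class: push toward the boundary
        else:
            hi = mid  # prediction flipped: back off
    return 0.5 * (lo + hi)

w = np.array([2.0, 0.0, 0.0, 0.0])       # a sampled latent that flips at psi=1
psi_star = bisect_truncation(w)          # converges near psi=0.5 for this toy
boundary_latent = truncate(w, psi_star)  # lies (approximately) on the boundary
```

In the actual method, decoding `boundary_latent` through the generator would yield an image near the classifier's decision boundary; the bisection makes this search efficient compared with randomly sampled truncation factors.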