Test Input Validation for Vision-based DL Systems: An Active Learning Approach

📅 2025-01-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address evaluation bias in testing vision-based deep learning systems, caused by synthetic inputs that are distorted and deviate from real-world scenarios, this paper proposes an automated test input validation framework based on active learning. The method combines multiple image-similarity metrics (SSIM, LPIPS, and feature-space distance) into a composite comparison mechanism, and introduces an iterative human-in-the-loop annotation strategy that trades off accuracy against annotation cost. Evaluated on an industrial and a public benchmark, the framework achieves an average validation accuracy of 97%. Multi-metric fusion improves accuracy by at least 5.4%, the active learning loop contributes a further 7.5%, and overall the approach surpasses state-of-the-art methods by at least 12.9% in average accuracy. The approach has been validated in industrial practice and shown to be readily deployable.
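The multi-metric comparison described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: since LPIPS requires a pretrained perceptual network, the sketch substitutes a simplified single-window SSIM and an intensity-histogram distance as stand-ins, and all function names and noise levels here are assumed for the example.

```python
import numpy as np

def global_ssim(a, b, data_range=1.0):
    """Simplified single-window SSIM (no sliding window); values in [-1, 1]."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    )

def histogram_distance(a, b, bins=16):
    """Chi-square-style distance between intensity histograms; a crude
    stand-in for the feature-space distance used in the paper."""
    ha, _ = np.histogram(a, bins=bins, range=(0.0, 1.0), density=True)
    hb, _ = np.histogram(b, bins=bins, range=(0.0, 1.0), density=True)
    return float(np.sum((ha - hb) ** 2 / (ha + hb + 1e-8)))

def metric_vector(original, transformed):
    """Stack several image-comparison metrics into one feature vector
    that a downstream validity classifier can consume."""
    mse = float(np.mean((original - transformed) ** 2))
    return np.array([
        global_ssim(original, transformed),
        histogram_distance(original, transformed),
        mse,
    ])

rng = np.random.default_rng(0)
img = rng.random((32, 32))
mild = np.clip(img + rng.normal(0, 0.02, img.shape), 0, 1)   # plausibly valid
severe = np.clip(img + rng.normal(0, 0.5, img.shape), 0, 1)  # likely invalid

v_mild = metric_vector(img, mild)
v_severe = metric_vector(img, severe)
```

A validity classifier trained on such vectors can exploit that a mildly transformed input scores high on similarity (SSIM) and low on distortion (MSE), while a heavily distorted one does the opposite; fusing several metrics is what lets the classifier outperform any single-metric threshold.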

📝 Abstract
Testing deep learning (DL) systems requires extensive and diverse, yet valid, test inputs. While synthetic test input generation methods, such as metamorphic testing, are widely used for DL testing, they risk introducing invalid inputs that do not accurately reflect real-world scenarios. Invalid test inputs can lead to misleading results. Hence, there is a need for automated validation of test inputs to ensure effective assessment of DL systems. In this paper, we propose a test input validation approach for vision-based DL systems. Our approach uses active learning to balance the trade-off between accuracy and the manual effort required for test input validation. Further, by employing multiple image-comparison metrics, it achieves better results in classifying valid and invalid test inputs compared to methods that rely on single metrics. We evaluate our approach using an industrial and a public-domain dataset. Our evaluation shows that our multi-metric, active learning-based approach produces several optimal accuracy-effort trade-offs, including those deemed practical and desirable by our industry partner. Furthermore, provided with the same level of manual effort, our approach is significantly more accurate than two state-of-the-art test input validation methods, achieving an average accuracy of 97%. Specifically, the use of multiple metrics, rather than a single metric, results in an average improvement of at least 5.4% in overall accuracy compared to the state-of-the-art baselines. Incorporating an active learning loop for test input validation yields an additional 7.5% improvement in average accuracy, bringing the overall average improvement of our approach to at least 12.9% compared to the baselines.
Problem

Research questions and friction points this paper is trying to address.

Image Recognition
Accuracy Assessment
Real-world Data Testing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Active Learning
Image Comparison Techniques
Visual System Verification
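The active learning loop named above is commonly realized as uncertainty sampling: train a cheap classifier on the few labeled test inputs, ask a human annotator to label only the inputs the classifier is least sure about, and repeat until the annotation budget is spent. The sketch below assumes synthetic metric-vector data and a centroid-based classifier as a stand-in for the paper's actual model; the data layout, budget, and scoring rule are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical metric vectors for 200 test inputs: valid inputs (label 0)
# cluster near 0 and invalid ones (label 1) near 1 along the first dimension.
n = 200
labels = np.array([0, 1] * (n // 2))
features = labels[:, None] + rng.normal(0, 0.4, (n, 3))

labeled = list(range(10))        # small seed set of human-labeled inputs
unlabeled = list(range(10, n))
budget = 30                      # remaining manual-annotation budget

def centroid_scores(X, y, Xq):
    """Signed score per query point: distance-to-valid-centroid minus
    distance-to-invalid-centroid. Positive leans invalid; near zero
    means the classifier is uncertain."""
    c0 = X[y == 0].mean(axis=0)
    c1 = X[y == 1].mean(axis=0)
    d0 = np.linalg.norm(Xq - c0, axis=1)
    d1 = np.linalg.norm(Xq - c1, axis=1)
    return d0 - d1

for _ in range(budget):
    scores = centroid_scores(features[labeled], labels[labeled],
                             features[unlabeled])
    pick = unlabeled[int(np.argmin(np.abs(scores)))]  # most uncertain input
    labeled.append(pick)         # the human oracle labels it
    unlabeled.remove(pick)

# Accuracy of the cheap classifier on the inputs never sent to the annotator
scores = centroid_scores(features[labeled], labels[labeled],
                         features[unlabeled])
preds = (scores > 0).astype(int)
accuracy = float((preds == labels[unlabeled]).mean())
```

Spending the budget on the most uncertain inputs, rather than a random sample, is what produces the accuracy-versus-effort trade-off curve the paper reports: each queried label sits near the decision boundary, where it moves the classifier the most.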