Detecting Systematic Weaknesses in Vision Models along Predefined Human-Understandable Dimensions

📅 2025-02-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current safety verification of vision models struggles to systematically identify human-interpretable weaknesses, such as those tied to Operational Design Domain (ODD) elements, in unstructured visual data. Method: We propose the first framework that explicitly embeds predefined safety dimensions into weakness detection, combining multimodal foundation models (CLIP, SAM), error-driven slice mining, and metadata-noise-robust calibration into a noise-resilient combinatorial search algorithm. Crucially, it operates without requiring high-quality semantic annotations, discovering low-performance subsets aligned with safety experts' definitions directly from raw images. Results: Our method significantly outperforms existing slice discovery approaches on four benchmark datasets (Cityscapes, BDD100k, RailSem19, and Mapillary Vistas), delivering interpretable, dimension-aligned, and traceable weakness analyses essential for rigorous safety assurance and certification.

📝 Abstract
Studying systematic weaknesses of DNNs has gained prominence in the last few years with the rising focus on building safe AI systems. Slice discovery methods (SDMs) are prominent algorithmic approaches for finding such systematic weaknesses. They identify top-k semantically coherent slices/subsets of data where a DNN-under-test has low performance. To be directly useful, e.g., as evidence in a safety argumentation, slices should be aligned with human-understandable (safety-relevant) dimensions, which, for example, are defined by safety and domain experts as parts of the operational design domain (ODD). While straightforward for structured data, the lack of semantic metadata makes these investigations challenging for unstructured data. Therefore, we propose a complete workflow which combines contemporary foundation models with algorithms for combinatorial search that consider structured data and DNN errors for finding systematic weaknesses in images. In contrast to existing approaches, ours identifies weak slices that are in line with predefined human-understandable dimensions. As the workflow includes foundation models, its intermediate and final results may not always be exact. Therefore, we build into our workflow an approach to address the impact of noisy metadata. We evaluate our approach w.r.t. its quality on four popular computer vision datasets, including autonomous driving datasets like Cityscapes, BDD100k, and RailSem19, while using multiple state-of-the-art models as DNNs-under-test.
Problem

Research questions and friction points this paper is trying to address.

Detecting systematic weaknesses in vision models
Aligning weaknesses with human-understandable dimensions
Addressing noisy metadata in workflow results
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses foundation models
Combinatorial search algorithms
Addresses noisy metadata impact
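The combinatorial-search idea behind these contributions can be sketched as follows. This is a minimal, hypothetical illustration (synthetic records, made-up dimension names, a simple error-rate ranking), not the paper's actual algorithm: per-image ODD-style metadata plus binary DNN error flags are grouped into candidate slices over all dimension-value combinations, and slices are ranked by error rate subject to a minimum support.

```python
from itertools import combinations

# Hypothetical per-image records: ODD-style metadata dimensions plus a
# binary error flag from the DNN-under-test (1 = misprediction).
records = [
    {"weather": "rain",  "time": "night", "error": 1},
    {"weather": "rain",  "time": "night", "error": 1},
    {"weather": "rain",  "time": "day",   "error": 0},
    {"weather": "clear", "time": "night", "error": 1},
    {"weather": "clear", "time": "day",   "error": 0},
    {"weather": "clear", "time": "day",   "error": 0},
    {"weather": "rain",  "time": "night", "error": 1},
    {"weather": "clear", "time": "day",   "error": 0},
]

def find_weak_slices(records, dims, min_support=2, top_k=3):
    """Enumerate value combinations over the given dimensions and rank
    the resulting slices by error rate; slices with fewer than
    min_support images are skipped to avoid spurious findings."""
    slices = {}
    for r in records:
        # Every non-empty subset of dimensions defines a candidate slice.
        for n in range(1, len(dims) + 1):
            for subset in combinations(dims, n):
                key = tuple((d, r[d]) for d in subset)
                total, errs = slices.get(key, (0, 0))
                slices[key] = (total + 1, errs + r["error"])
    ranked = [
        (key, errs / total, total)
        for key, (total, errs) in slices.items()
        if total >= min_support
    ]
    # Sort by error rate, then by support, both descending.
    ranked.sort(key=lambda x: (-x[1], -x[2]))
    return ranked[:top_k]

for key, rate, support in find_weak_slices(records, ["weather", "time"]):
    print(dict(key), f"error_rate={rate:.2f}", f"n={support}")
```

In this toy example the night-time slice surfaces as the weakest. The paper's workflow additionally derives such metadata from raw images via foundation models and compensates for the resulting label noise, which this sketch does not attempt.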
Sujan Sai Gannamaneni
Fraunhofer IAIS, Lamarr Institute
Rohil Prakash Rao
Fraunhofer IAIS
Michael Mock
Fraunhofer IAIS
M. Akila
Fraunhofer IAIS, Lamarr Institute
Stefan Wrobel
Fraunhofer IAIS and University of Bonn
Artificial Intelligence, Machine Learning, Visual Analytics