Error Slice Discovery via Manifold Compactness

📅 2025-01-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Systematic errors that deep learning models make on specific data subsets remain difficult to detect and attribute, which hinders model diagnosis and improvement. This paper proposes an unsupervised error slice discovery method that requires no predefined slice labels and automatically identifies semantically coherent, interpretable underperforming subpopulations. Its core contribution is a formal definition of slice coherence based on manifold compactness, which embeds the geometric structure of the data into the optimization objective so that risk and intra-slice coherence are optimized jointly. The resulting algorithm, MCSD, can be applied to models for various tasks, including image and text classification. In experiments on a benchmark and in case studies on other datasets, the authors report that it outperforms existing approaches and yields more interpretable slices for model analysis and refinement.

📝 Abstract

Despite the strong performance of deep learning models in many areas, they still make mistakes and underperform on certain subsets of data, i.e., error slices. Given a trained model, it is important to identify its semantically coherent error slices that are easy to interpret; this is referred to as the error slice discovery problem. However, there is no proper metric of slice coherence that does not rely on extra information such as predefined slice labels. Current evaluation of slice coherence requires access to predefined slices formulated from metadata such as attributes or subclasses. Its validity relies heavily on the quality and abundance of that metadata, so some possible patterns may be ignored. Moreover, because no explicit coherence metric exists, current algorithms cannot directly incorporate a coherence constraint into their optimization objective, which could hinder their effectiveness. In this paper, we propose manifold compactness, a coherence metric that requires no extra information because it incorporates the geometric properties of the data into its design; experiments on typical datasets empirically validate the metric. We then develop Manifold Compactness based error Slice Discovery (MCSD), a novel algorithm that directly treats risk and coherence as the optimization objective and can be flexibly applied to models for various tasks. Extensive experiments on the benchmark and case studies on other typical datasets demonstrate the superiority of MCSD.
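
To make the idea of a geometry-based coherence score concrete, here is a minimal sketch in Python. It is not the paper's definition of manifold compactness or the MCSD algorithm, neither of which is given in the abstract. It assumes, purely for illustration, that the data geometry is approximated by a symmetric k-nearest-neighbour graph over feature vectors, that coherence is the average number of graph neighbours each slice member has inside the slice, and that risk and coherence are combined as a weighted sum. The names `knn_graph`, `compactness`, `slice_score`, and the weight `lam` are hypothetical.

```python
import numpy as np


def knn_graph(X, k):
    """Symmetric 0/1 adjacency matrix of a k-nearest-neighbour graph (no self-loops)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances; O(n^2) memory, fine for a toy example.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)           # a point is never its own neighbour
    nbrs = np.argsort(d2, axis=1)[:, :k]   # indices of the k closest points per row
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(A, A.T)              # keep an edge if either endpoint chose it


def compactness(A, members):
    """Illustrative coherence proxy: mean within-slice degree of the slice members."""
    idx = np.flatnonzero(members)
    if idx.size == 0:
        return 0.0
    return A[np.ix_(idx, idx)].sum() / idx.size


def slice_score(A, losses, members, lam=1.0):
    """Assumed combined objective: mean loss in the slice plus lam * compactness."""
    idx = np.flatnonzero(members)
    if idx.size == 0:
        return 0.0
    return losses[idx].mean() + lam * compactness(A, members)


# Toy data: two well-separated groups on a line; the model fails on the first one.
X = np.concatenate([np.arange(10.0), 100.0 + np.arange(10.0)])[:, None]
losses = np.concatenate([np.ones(10), np.zeros(10)])
A = knn_graph(X, k=2)

coherent = np.zeros(20, dtype=bool)
coherent[:10] = True                       # the whole first group
scattered = np.zeros(20, dtype=bool)
scattered[::2] = True                      # every other point from both groups

print(compactness(A, coherent), compactness(A, scattered))
print(slice_score(A, losses, coherent), slice_score(A, losses, scattered))
```

In this toy setup the slice that covers one whole group scores higher on both terms than a slice of the same size scattered across both groups, which is the kind of preference a label-free coherence metric is meant to express. The choice of graph, the coherence proxy, and the weighting are all assumptions of this sketch.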

Problem

Research questions and friction points this paper is trying to address.

Deep Learning

Error Analysis

Model Improvement

Innovation

Methods, ideas, or system contributions that make the work stand out.

Manifold Compactness

MCSD Algorithm

Error Type Optimization
