🤖 AI Summary
Convolutional neural networks (CNNs) for image classification often exhibit hidden bias, relying on background cues rather than semantic object features, which undermines decision interpretability and fairness.
Method: We propose a bias detection framework that operates without requiring blank-background images. Our approach combines random patch shuffling with multi-scale spatial- and frequency-domain transformations—including Fourier, wavelet, and median filtering—to disentangle contextual information from spurious background noise.
Results: Extensive experiments across six diverse datasets—spanning natural, synthetic, and hybrid domains—demonstrate that our method effectively exposes CNNs’ unintended dependence on background artifacts. It significantly improves discrimination between semantic features and confounding background information, thereby enhancing model robustness and interpretability. Crucially, the framework provides a reproducible, minimally invasive paradigm for fairness assessment in vision models—requiring no architectural modification or retraining, and avoiding unrealistic data assumptions such as uniform background removal.
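The random patch shuffling step described above can be sketched roughly as follows. This is a minimal NumPy illustration under our own assumptions (the function name `shuffle_tiles`, grayscale 2-D input, and the crop-to-fit handling of uneven tile sizes are ours, not the authors' code):

```python
import numpy as np

def shuffle_tiles(image, tile_size, rng=None):
    """Split a 2-D image into non-overlapping square tiles and place
    them back in a random order, destroying global spatial structure
    while preserving local tile content."""
    rng = np.random.default_rng() if rng is None else rng
    t = tile_size
    h, w = image.shape
    # Crop so both dimensions divide evenly into tiles (illustrative choice).
    image = image[: h - h % t, : w - w % t]
    rows, cols = image.shape[0] // t, image.shape[1] // t
    tiles = [image[r * t:(r + 1) * t, c * t:(c + 1) * t]
             for r in range(rows) for c in range(cols)]
    order = rng.permutation(len(tiles))  # random destination for each tile
    out = np.empty_like(image)
    for dest, tile in zip(order, tiles):
        r, c = divmod(int(dest), cols)
        out[r * t:(r + 1) * t, c * t:(c + 1) * t] = tile
    return out
```

If classification accuracy survives shuffling at small tile sizes, the model is likely keying on local texture or background statistics rather than object shape.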
📝 Abstract
CNNs are now the primary choice for most machine vision problems due to their high classification accuracy and the availability of user-friendly libraries. These networks identify and select features in a non-intuitive, data-driven manner, making it difficult to determine which features were most influential. That leads to a "black box", where users cannot know how the image data are analyzed and must rely on empirical results. Therefore, the decision-making process can be biased by background information that is difficult to detect. Here we discuss examples of such hidden biases, propose techniques for identifying them and methods to distinguish between contextual information and background noise, and explore whether CNNs learn from irrelevant features. One effective way to identify dataset bias is to classify blank background parts of the images. However, in some situations a blank background is not available, making it more difficult to separate the foreground information from the background. Such parts of the image can also reflect contextual learning, not necessarily bias. To overcome this, we propose two approaches, tested on six different datasets, including natural, synthetic, and hybrid datasets. The first method divides images into smaller, non-overlapping tiles of various sizes, which are then shuffled randomly, making classification more challenging. The second method applies several image transforms, including the Fourier and wavelet transforms and the median filter, and their combinations; these transforms recover background noise information used by the CNN to classify images. Results indicate that this method can effectively distinguish between contextual information and background noise, and can alert to the presence of background noise even without access to blank background regions.
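The frequency-domain idea can be illustrated with a simple Fourier low-pass step: keeping only the lowest spatial frequencies approximates the slowly varying background signal a CNN might exploit. This is a sketch under our own assumptions (the function name `fourier_background` and the `keep_fraction` cutoff are illustrative, not the paper's exact procedure):

```python
import numpy as np

def fourier_background(image, keep_fraction=0.1):
    """Zero out all but the lowest spatial frequencies of a 2-D image,
    returning a smooth reconstruction that approximates the background."""
    f = np.fft.fftshift(np.fft.fft2(image))  # move DC term to the center
    h, w = image.shape
    mask = np.zeros_like(f)
    ch, cw = h // 2, w // 2
    dh = int(h * keep_fraction / 2)
    dw = int(w * keep_fraction / 2)
    # Keep a small centered window of low frequencies (at least the DC term).
    mask[ch - dh:ch + dh + 1, cw - dw:cw + dw + 1] = 1
    return np.fft.ifft2(np.fft.ifftshift(f * mask)).real
```

Classifying such reconstructions (or the residual after subtracting them) probes whether the model's decisions depend on low-frequency background content rather than object detail; the wavelet and median-filter variants described in the abstract play an analogous role at other scales.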