Corner Cases: How Size and Position of Objects Challenge ImageNet-Trained Models

📅 2025-05-06
🤖 AI Summary
This work identifies object scale (ROI-to-image ratio) and spatial location (how far the object sits from the image center) as spatial biases that increase a vision model's reliance on spurious background correlations. To study this systematically, the authors introduce Hard-Spurious-ImageNet, a synthetic benchmark derived from ImageNet1k that varies object scale, object position, and background in a controlled way, and use it to evaluate the robustness of mainstream pretrained ImageNet models (e.g., ResNet, ViT). Experiments reveal severe degradation (a drop of more than 40% in worst-group accuracy) when objects are both small and far from the center. Existing bias-mitigation methods (e.g., IRM, GroupDRO) improve worst-group accuracy by less than 2% under these conditions, which indicates that they do not account for the size and location of the core object. The study argues that these spatial factors are a major driver of spurious correlations and offers an interpretable, controllable synthetic benchmark for diagnosing them in robust visual representation learning.

📝 Abstract
Backgrounds in images play a major role in creating spurious correlations among data points. Owing to the aesthetic preferences of the people capturing the images, datasets can exhibit positional biases (the location of the object within the frame) and size biases (the region-of-interest to image ratio) that differ across classes. In this paper, we show that these biases affect how much a model relies on spurious background features to make its predictions. To illustrate our findings, we propose Hard-Spurious-ImageNet, a synthetic dataset derived from ImageNet1k that contains images with various backgrounds, object positions, and object sizes. By evaluating different pretrained models on this dataset, we find that most models rely heavily on spurious background features when the region-of-interest (ROI) to image ratio is small and the object is far from the center of the image. Moreover, we show that current methods for mitigating harmful spurious features do not take these factors into account and therefore fail to achieve considerable gains in worst-group accuracy when the size and location of the core features in an image change.
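
The two quantities named in the abstract can be illustrated with a minimal Python sketch. The abstract defines the size measure as the ROI-to-image ratio; it does not give a formula for "far from the center", so the normalized center offset below is an assumption made only for illustration, as are the function names and the `(x0, y0, x1, y1)` box convention.

```python
import math

def roi_ratio(box, image_size):
    """Fraction of the image area covered by the bounding box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    width, height = image_size
    return ((x1 - x0) * (y1 - y0)) / (width * height)

def center_offset(box, image_size):
    """Distance from box center to image center, scaled to [0, 1].

    Assumption: the distance is normalized by the half-diagonal of the image,
    so 0 means perfectly centered and 1 means the box center is at a corner.
    """
    x0, y0, x1, y1 = box
    width, height = image_size
    dx = (x0 + x1) / 2 - width / 2
    dy = (y0 + y1) / 2 - height / 2
    return math.hypot(dx, dy) / math.hypot(width / 2, height / 2)

# A 112x112 object in the top-left quadrant of a 224x224 image.
box = (0, 0, 112, 112)
print(roi_ratio(box, (224, 224)))  # 0.25
print(center_offset(box, (224, 224)))
```

Under these assumed definitions, the hard cases described in the abstract are images with a low `roi_ratio` and a high `center_offset`.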

Problem

Research questions and friction points this paper is trying to address.

Impact of object size and position on model reliance on spurious background features
Existing datasets exhibit biases in object location and region-of-interest ratio
Current mitigation methods fail to address size and location-based spurious features
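
Worst-group accuracy, the metric used to judge the mitigation methods, is the minimum per-group accuracy over all groups. The sketch below is a minimal stdlib-only version; the choice of groups (a size bin paired with a position bin) and the label names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def worst_group_accuracy(records):
    """Return (worst_group, worst_accuracy, per_group_accuracy).

    records: iterable of (group, is_correct) pairs, where group is any
    hashable label, e.g. a hypothetical (size_bin, position_bin) tuple.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, is_correct in records:
        total[group] += 1
        correct[group] += int(is_correct)
    if not total:
        raise ValueError("records must contain at least one prediction")
    per_group = {group: correct[group] / total[group] for group in total}
    worst = min(per_group, key=per_group.get)
    return worst, per_group[worst], per_group

# Toy predictions: centered large objects are easy, small corner objects are not.
records = [
    (("large", "center"), True),
    (("large", "center"), True),
    (("small", "corner"), True),
    (("small", "corner"), False),
]
group, accuracy, _ = worst_group_accuracy(records)
print(group, accuracy)  # ('small', 'corner') 0.5
```

Reporting the minimum rather than the mean keeps a weak group, such as small off-center objects, from being hidden by the easier groups.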

Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthetic dataset Hard-Spurious-ImageNet for bias analysis
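Evaluation of pretrained models across varied backgrounds, object positions, and object sizes
Evidence that current spurious-feature mitigation methods yield limited worst-group accuracy gains when object size and location change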
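
Controlling object size and position in a synthetic image comes down to choosing where to paste the object on a background. The sketch below computes such a paste box for a target ROI-to-image ratio. It is a hypothetical illustration, not the paper's generation pipeline, which this page does not describe; the function name, the `anchor` parameter, and the aspect-ratio handling are all assumptions.

```python
import math

def paste_box(image_size, object_aspect, target_ratio, anchor):
    """Return (x0, y0, x1, y1) for an object covering target_ratio of the image.

    image_size:    (width, height) of the background in pixels.
    object_aspect: object width divided by object height.
    target_ratio:  desired ROI-to-image area ratio, in (0, 1].
    anchor:        (ax, ay) in [0, 1]; (0, 0) places the object flush with the
                   top-left corner, (0.5, 0.5) centers it, (1, 1) places it
                   flush with the bottom-right corner (assumed convention).
    """
    width, height = image_size
    area = target_ratio * width * height
    obj_h = math.sqrt(area / object_aspect)
    obj_w = obj_h * object_aspect
    if obj_w > width or obj_h > height:
        raise ValueError("object does not fit at this ratio and aspect")
    ax, ay = anchor
    x0 = ax * (width - obj_w)
    y0 = ay * (height - obj_h)
    return (x0, y0, x0 + obj_w, y0 + obj_h)

# A square object covering a quarter of a 224x224 image, centered and in a corner.
print(paste_box((224, 224), 1.0, 0.25, (0.5, 0.5)))  # (56.0, 56.0, 168.0, 168.0)
print(paste_box((224, 224), 1.0, 0.25, (1.0, 1.0)))  # (112.0, 112.0, 224.0, 224.0)
```

Sweeping `target_ratio` and `anchor` over a grid while swapping backgrounds would give one group per size and position combination, which is the kind of controlled variation the benchmark is described as providing.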