Scale generalisation properties of extended scale-covariant and scale-invariant Gaussian derivative networks on image datasets with spatial scaling variations

📅 2024-09-17
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work analyses the scale generalisation of scale-covariant and scale-invariant Gaussian derivative networks to spatial scales not seen during training. To evaluate scale robustness, rescaled variants of Fashion-MNIST and CIFAR-10 are constructed, with spatial scaling variations over a factor of 4 in the test data that are absent from the training data, and benchmarking is also performed on the previously existing STIR datasets. Three main extensions are proposed: (1) scale-channel dropout for regularisation; (2) a spatial max-pooling mechanism after the final layer, to localise non-centred objects while preserving scale generalisation; and (3) average pooling of feature responses over scales as an alternative to max pooling, which sometimes yields better cross-scale fusion. The networks use the discrete analogue of the Gaussian kernel combined with central difference operators to preserve scale covariance or invariance, and visualisations of activation maps and learned receptive fields support interpretability. Experiments show better scale generalisation on unseen-scale test sets than previously reported for other types of deep networks, with the discretised implementation performing best or among the best in the ablation studies.
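The scale-channel construction the summary refers to can be sketched as follows: the image is filtered with scale-normalised Gaussian derivatives at several scales, and the resulting scale channels are fused by pooling over the scale axis. A minimal sketch using `scipy.ndimage.gaussian_filter`; the function names and the choice γ = 1 for the scale normalisation are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def scale_channel_responses(image, sigmas, order=(0, 1), gamma=1.0):
    """Scale-normalised Gaussian derivative responses at several scales.

    For each sigma, the image is filtered with a Gaussian derivative of
    the given order (per axis), and the response is multiplied by
    sigma**(gamma * total_order) so that responses become comparable
    across scales (scale normalisation).
    """
    total_order = sum(order)
    return np.stack([
        (s ** (gamma * total_order)) * gaussian_filter(image, sigma=s, order=order)
        for s in sigmas
    ])

def pool_over_scales(channels, mode="max"):
    """Fuse the scale channels by max or average pooling over the scale axis."""
    return channels.max(axis=0) if mode == "max" else channels.mean(axis=0)
```

Max pooling over scales picks the single best-matching scale at each position, while average pooling blends responses across neighbouring scales, which is the cross-scale fusion alternative evaluated in the paper.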

📝 Abstract
This paper presents an in-depth analysis of the scale generalisation properties of the scale-covariant and scale-invariant Gaussian derivative networks, complemented with both conceptual and algorithmic extensions. For this purpose, Gaussian derivative networks are evaluated on new rescaled versions of the Fashion-MNIST and the CIFAR-10 datasets, with spatial scaling variations over a factor of 4 in the testing data, that are not present in the training data. Additionally, evaluations on the previously existing STIR datasets show that the Gaussian derivative networks achieve better scale generalisation than previously reported for these datasets for other types of deep networks. We first experimentally demonstrate that the Gaussian derivative networks have quite good scale generalisation properties on the new datasets, and that average pooling of feature responses over scales may sometimes also lead to better results than the previously used approach of max pooling over scales. Then, we demonstrate that using a spatial max pooling mechanism after the final layer enables localisation of non-centred objects in image domain, with maintained scale generalisation properties. We also show that regularisation during training, by applying dropout across the scale channels, referred to as scale-channel dropout, improves both the performance and the scale generalisation. In additional ablation studies, we demonstrate that discretisations of Gaussian derivative networks, based on the discrete analogue of the Gaussian kernel in combination with central difference operators, perform best or among the best, compared to a set of other discrete approximations of the Gaussian derivative kernels. Finally, by visualising the activation maps and the learned receptive fields, we demonstrate that the Gaussian derivative networks have very good explainability properties.
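The scale-channel dropout mentioned in the abstract amounts to applying dropout across whole scale channels rather than individual units. A minimal sketch of the idea, assuming channels of shape (num_scales, H, W) and standard inverted dropout; the function name and parameters are hypothetical, not the authors' exact implementation:

```python
import numpy as np

def scale_channel_dropout(channels, p=0.5, rng=None, training=True):
    """Randomly zero out entire scale channels during training.

    channels: array of shape (num_scales, H, W). Each scale channel is
    dropped independently with probability p, and the surviving channels
    are rescaled by 1/(1 - p) (inverted dropout), so the expected
    response is unchanged. At test time the input passes through as-is.
    """
    if not training or p == 0.0:
        return channels
    rng = np.random.default_rng() if rng is None else rng
    keep = rng.random(channels.shape[0]) >= p
    mask = keep.astype(channels.dtype) / (1.0 - p)
    return channels * mask[:, None, None]
```

Dropping whole scale channels forces the network not to rely on any single scale channel, which is one plausible reading of why this regularisation improves scale generalisation.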
Problem

Research questions and friction points this paper is trying to address.

Analyzing scale generalization in Gaussian derivative networks
Evaluating performance on rescaled Fashion-MNIST and CIFAR-10 datasets
Improving scale generalization via scale-channel dropout and pooling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Gaussian derivative networks for scale generalization
Implements scale-channel dropout for regularization
Applies spatial max pooling for object localization
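The discretisation the paper's ablations favour combines the discrete analogue of the Gaussian kernel, T(n, t) = e^{-t} I_n(t) with I_n a modified Bessel function, with central difference operators for the derivatives. A short sketch of those two building blocks, assuming `scipy.special.ive` (the exponentially scaled Bessel function) is available; this is an illustration of the kernels, not the authors' network code:

```python
import numpy as np
from scipy.special import ive  # ive(n, t) = exp(-t) * I_n(t) for t > 0

def discrete_gaussian_kernel(t, radius):
    """Discrete analogue of the Gaussian kernel, T(n, t) = exp(-t) I_n(t),
    truncated to n in [-radius, radius]."""
    n = np.arange(-radius, radius + 1)
    return ive(np.abs(n), t)

def central_difference(signal):
    """First-order central difference, (f[n+1] - f[n-1]) / 2."""
    return np.convolve(signal, [0.5, 0.0, -0.5], mode="same")
```

Unlike a sampled continuous Gaussian, the discrete analogue obeys an exact semigroup property over the scale parameter t, which is what makes cascaded smoothing and scale covariance hold exactly in the discrete domain.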
Andrzej Perzanowski
Computational Brain Science Lab, Division of Computational Science and Technology, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden
Tony Lindeberg
Professor of Computer Science - Computational Vision, KTH Royal Institute of Technology
Computer Vision · Scale Space · Recognition · Image Analysis · Neuroscience