Foveated Retinotopy Improves Classification and Localization in CNNs

📅 2024-02-23
📈 Citations: 1
Influential: 0
🤖 AI Summary
To improve the robustness of image classification to scale and rotation changes, this paper proposes a biologically inspired foveated retinotopic input transformation, motivated by the primate fovea, that gives a CNN a fixation-dependent view of the image (high resolution at the fovea, low resolution in the periphery). The transformation is applied at the input layer of standard ResNet models, which are then re-trained without changes to the backbone architecture. Experiments show that classification accuracy remains comparable while robustness to scale and rotation perturbations improves. The transformed networks are more sensitive to shifts of the fixation point, and the paper turns this into an advantage: the variation of classification probabilities across gaze positions indicates where the object is, so localization is obtained from classification responses alone, without dedicated detection heads or bounding-box annotations. The authors suggest that foveated retinotopic mapping implicitly encodes object geometry and offers an efficient approach to visual search.
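
To make the idea of a fixation-dependent retinal map concrete, here is a minimal NumPy sketch of one common way to build such a view: a log-polar resampling centred on the fixation point. The summary does not state which mapping the authors use, so the log-polar grid, the nearest-neighbour sampling, and the name `foveate_log_polar` with its parameters are all assumptions made for illustration.

```python
import numpy as np

def foveate_log_polar(image, fixation, out_shape=(64, 64), r_min=1.0):
    """Resample a 2D grayscale image on a log-polar grid around `fixation`.

    Illustrative assumption, not the paper's implementation.
    Output rows index log-spaced eccentricity (fovea first),
    output columns index polar angle.
    """
    h, w = image.shape
    fy, fx = fixation
    n_r, n_theta = out_shape

    # Largest eccentricity: distance from the fixation to the farthest corner.
    r_max = np.hypot(max(fy, h - 1 - fy), max(fx, w - 1 - fx))
    r_max = max(r_max, r_min)

    # Log-spaced radii give dense sampling near the fixation, sparse far away.
    radii = np.geomspace(r_min, r_max, n_r)
    angles = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    rr, tt = np.meshgrid(radii, angles, indexing="ij")

    # Nearest-neighbour lookup of each grid point in the source image.
    ys = np.rint(fy + rr * np.sin(tt)).astype(int)
    xs = np.rint(fx + rr * np.cos(tt)).astype(int)
    inside = (ys >= 0) & (ys < h) & (xs >= 0) & (xs < w)

    out = np.zeros(out_shape, dtype=image.dtype)
    out[inside] = image[ys[inside], xs[inside]]  # points outside stay at zero
    return out
```

Under a log-polar mapping, scaling the image about the fixation becomes a shift along the rows and rotating it becomes a circular shift along the columns. This is one plausible reason a convolutional network fed such an input could tolerate scale and rotation changes better, while becoming more sensitive to where the fixation lands.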

📝 Abstract
From a falcon detecting prey to humans recognizing faces, many species exhibit extraordinary abilities in rapid visual localization and classification. These are made possible by a specialized retinal region called the fovea, which provides high acuity at the center of vision while maintaining lower resolution in the periphery. This distinctive spatial organization, preserved along the early visual pathway through retinotopic mapping, is fundamental to biological vision, yet remains largely unexplored in machine learning. Our study investigates how incorporating foveated retinotopy may benefit deep convolutional neural networks (CNNs) in image classification tasks. By implementing a foveated retinotopic transformation in the input layer of standard ResNet models and re-training them, we maintain comparable classification accuracy while enhancing the network's robustness to scale and rotational perturbations. Although this architectural modification introduces increased sensitivity to fixation point shifts, we demonstrate how this apparent limitation becomes advantageous: variations in classification probabilities across different gaze positions serve as effective indicators for object localization. Our findings suggest that foveated retinotopic mapping encodes implicit knowledge about visual object geometry, offering an efficient solution to the visual search problem - a capability crucial for many living species.
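
The localization idea in the abstract, reading object position from how classification probability varies with gaze position, can be sketched as a scan over candidate fixations. This is a hedged illustration only: `likelihood_map`, `localize`, and the `stride` parameter are hypothetical names, and `toy_score` is a stand-in for the re-trained foveated network, which is not reproduced here.

```python
import numpy as np

def likelihood_map(image, score_fn, stride=8):
    """Evaluate `score_fn(image, (y, x))` on a regular grid of fixations."""
    h, w = image.shape[:2]
    ys = np.arange(stride // 2, h, stride)
    xs = np.arange(stride // 2, w, stride)
    scores = np.array([[score_fn(image, (y, x)) for x in xs] for y in ys])
    return ys, xs, scores

def localize(image, score_fn, stride=8):
    """Return the fixation (y, x) with the highest score."""
    ys, xs, scores = likelihood_map(image, score_fn, stride)
    iy, ix = np.unravel_index(np.argmax(scores), scores.shape)
    return int(ys[iy]), int(xs[ix])

def toy_score(image, fixation, radius=6):
    """Stand-in for the network (assumption): mean brightness near the fixation.

    A real score would be the class probability returned by a CNN
    that receives the foveated view centred on `fixation`.
    """
    y, x = fixation
    patch = image[max(y - radius, 0): y + radius + 1,
                  max(x - radius, 0): x + radius + 1]
    return float(patch.mean())

# Synthetic scene: a bright square "object" centred at row 44, column 20.
scene = np.zeros((64, 64))
scene[39:50, 15:26] = 1.0

estimate = localize(scene, toy_score)
print("estimated object position:", estimate)
```

No bounding-box labels enter this procedure; only a per-fixation score is required. The grid spacing trades localization precision against the number of forward passes, and a coarse-to-fine or saccade-like search could reduce that cost.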
Problem

Research questions and friction points this paper is trying to address.

Biological Foveation
Computer Vision
Robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Focal Retinotopy
Robust Object Recognition
Position-sensitive Feature Extraction
Jean-Nicolas Jérémie
Institut de Neurosciences de la Timone, Aix-Marseille Université - CNRS UMR 7289, Marseille, France
Emmanuel Daucé
Institut de Neurosciences de la Timone, Aix-Marseille Université - CNRS UMR 7289, Marseille, France; École Centrale Méditerranée, Marseille, France
Laurent Perrinet
Institut de Neurosciences de la Timone, Aix-Marseille Université - CNRS UMR 7289, Marseille, France