Adopting a human developmental visual diet yields robust, shape-based AI vision

📅 2025-07-03
🤖 AI Summary
Current AI vision systems exhibit weak shape perception, strong texture bias, poor robustness, and limited ability to recognise abstract shapes. To address these limitations, this paper proposes the Developmental Visual Diet (DVD), a training paradigm inspired by human visual development. DVD formalizes the progression from infant to adult vision as a quantifiable, staged curriculum: early stages emphasize shape priors, while complexity (such as textured surfaces and cluttered backgrounds) is introduced incrementally. DVD requires no model scaling; instead, it relies on curriculum-based data scheduling and training strategies informed by psychophysics and neurophysiology. The reported results show substantial gains in shape reliance, abstract shape recognition, and robustness to image corruptions and adversarial attacks, with DVD-trained models outperforming larger foundation models trained on orders of magnitude more data. The core contribution is the translation of developmental findings into a computationally executable visual learning curriculum that moves AI vision closer to human perception.
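
The staged-curriculum idea can be illustrated with a toy schedule. The sketch below is not the authors' implementation: it assumes, purely for illustration, that "developmental stage" is modelled by a single Gaussian-blur strength that shrinks linearly as training progresses, standing in for the low visual acuity of early infancy. The names `blur_sigma`, `gaussian_kernel`, and `blur_row`, and the default `sigma_start=4.0`, are hypothetical choices and do not come from the paper.

```python
import math
from typing import List


def blur_sigma(progress: float, sigma_start: float = 4.0, sigma_end: float = 0.0) -> float:
    """Hypothetical schedule: blur strength falls linearly from sigma_start
    (start of training, progress=0.0) to sigma_end (end of training, progress=1.0)."""
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must lie in [0, 1]")
    return sigma_start + (sigma_end - sigma_start) * progress


def gaussian_kernel(sigma: float) -> List[float]:
    """Normalised 1D Gaussian kernel truncated at 3 sigma; identity kernel if sigma <= 0."""
    if sigma <= 0:
        return [1.0]
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma)) for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_row(row: List[float], sigma: float) -> List[float]:
    """Blur one row of pixel intensities, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(row)
    blurred = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # replicate edge pixels
            acc += w * row[j]
        blurred.append(acc)
    return blurred


if __name__ == "__main__":
    # A fine alternating pattern stands in for high-frequency "texture".
    texture = [0.0, 1.0] * 32
    for progress in (0.0, 0.5, 1.0):
        sigma = blur_sigma(progress)
        centre = blur_row(texture, sigma)[30:34]
        print(f"progress={progress:.1f} sigma={sigma:.1f} centre={centre}")
```

Early in this toy schedule the fine alternating pattern is smoothed towards a uniform grey, so only coarse structure would be available to a learner; by the end of the schedule the input passes through unchanged. A real curriculum would apply such transforms to 2D images inside the data-loading pipeline and could vary several image properties rather than one.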

📝 Abstract
Despite years of research and the dramatic scaling of artificial intelligence (AI) systems, a striking misalignment between artificial and human vision persists. Unlike humans, AI relies heavily on texture features rather than shape information, lacks robustness to image distortions, remains highly vulnerable to adversarial attacks, and struggles to recognise simple abstract shapes within complex backgrounds. To close this gap, we introduce a solution that arises from a previously underexplored direction: rather than scaling up, we take inspiration from how human vision develops from early infancy into adulthood. We quantified this visual maturation by synthesising decades of psychophysical and neurophysiological research into a novel developmental visual diet (DVD) for AI vision. We show that guiding AI systems through this human-inspired curriculum produces models that closely align with human behaviour on every hallmark of robust vision tested, yielding the strongest reported reliance on shape information to date, abstract shape recognition beyond the state of the art, higher robustness to image corruptions, and stronger resilience to adversarial attacks. By outperforming high-parameter AI foundation models trained on orders of magnitude more data, we provide evidence that robust AI vision can be achieved by guiding how a model learns, not merely how much it learns, offering a resource-efficient route toward safer and more human-like artificial visual systems.

Problem

Research questions and friction points this paper is trying to address.

AI vision lacks shape-based recognition like humans
Current AI models are vulnerable to adversarial attacks
AI struggles with abstract shapes in complex backgrounds

Innovation

Methods, ideas, or system contributions that make the work stand out.

Human-inspired developmental visual diet for AI
Shape-based AI vision surpassing texture reliance
Resource-efficient robust AI vision training

Authors

Zejin Lu
Machine Learning Group, Institute for Cognitive Science, Osnabrück University, Osnabrück, Germany.
Sushrut Thorat
Machine Learning Group, Institute for Cognitive Science, Osnabrück University, Osnabrück, Germany.
Radoslaw M Cichy
Neural Dynamics of Visual Cognition Group, Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany.
Tim C Kietzmann
Institute of Cognitive Science, University of Osnabrück
cognitive computational neuroscience, vision, machine learning, deep learning, computational modeling