From Images to Perception: Emergence of Perceptual Properties by Reconstructing Images

📅 2025-08-14
🤖 AI Summary
This study investigates whether human-like visual perception can emerge, without supervision, from natural image statistics. Method: We propose PerceptNet, a biologically inspired architecture that models retina-V1 processing, and train it end-to-end on several self-supervised image reconstruction tasks: autoencoding, denoising, deblurring, and sparsity regularization, with no perceptual labels. Contribution/Results: The encoder (V1-like) stage shows the highest correlation with human judgments of image distortion (Pearson’s *r* > 0.9), and this alignment peaks at moderate levels of noise, blur, and sparsity. The correspondence arises from statistical regularities in natural images rather than from task-specific supervision. These findings are consistent with efficient coding theories of early vision and suggest that biologically grounded models can learn perceptual metrics without human supervision.

📝 Abstract
A number of scientists have suggested that human visual perception may emerge from image statistics, which shape efficient neural representations in early vision. In this work, PerceptNet, a bio-inspired architecture that accommodates several known facts about the retina-V1 pathway, is optimized end-to-end for different tasks related to image reconstruction: autoencoding, denoising, deblurring, and sparsity regularization. Our results show that the encoder stage (V1-like layer) consistently exhibits the highest correlation with human perceptual judgments of image distortion, even though no perceptual information is used in initialization or training. This alignment exhibits an optimum for moderate noise, blur, and sparsity. These findings suggest that the visual system may be tuned to remove those particular levels of distortion at that level of sparsity, and that biologically inspired models can learn perceptual metrics without human supervision.
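
The four reconstruction tasks named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the function names, the use of mean squared error as the reconstruction loss, the box filter standing in for the blur, the L1 penalty as the sparsity term, and the identity encoder/decoder that replaces PerceptNet itself.

```python
import random

def add_noise(img, sigma, rng):
    """Denoising task input: add zero-mean Gaussian noise with std `sigma`."""
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in img]

def box_blur(img, radius):
    """Deblurring task input: mean filter over a (2*radius+1)^2 window.
    Edges are handled by clamping indices. A box filter is a stand-in;
    the paper's blur kernel is not specified in this summary."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            vals = [img[min(max(i + di, 0), h - 1)][min(max(j + dj, 0), w - 1)]
                    for di in range(-radius, radius + 1)
                    for dj in range(-radius, radius + 1)]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out

def reconstruction_loss(clean, corrupted, encoder, decoder, sparsity_weight=0.0):
    """Assumed loss: MSE(decoder(encoder(corrupted)), clean)
    plus sparsity_weight * mean absolute encoder response."""
    code = encoder(corrupted)            # flat list of encoder responses
    recon = decoder(code)                # 2D list, same shape as `clean`
    n = len(clean) * len(clean[0])
    mse = sum((r - c) ** 2
              for recon_row, clean_row in zip(recon, clean)
              for r, c in zip(recon_row, clean_row)) / n
    l1 = sum(abs(a) for a in code) / len(code)
    return mse + sparsity_weight * l1

# Hypothetical stand-ins for the network: an identity encoder/decoder pair.
def flatten_encoder(img):
    return [p for row in img for p in row]

def make_unflatten_decoder(width):
    def decoder(code):
        return [code[k:k + width] for k in range(0, len(code), width)]
    return decoder

rng = random.Random(0)
clean = [[0.0, 1.0], [1.0, 0.0]]
dec = make_unflatten_decoder(2)

autoencoding = reconstruction_loss(clean, clean, flatten_encoder, dec)
denoising = reconstruction_loss(clean, add_noise(clean, 0.1, rng), flatten_encoder, dec)
deblurring = reconstruction_loss(clean, box_blur(clean, 1), flatten_encoder, dec)
sparse = reconstruction_loss(clean, clean, flatten_encoder, dec, sparsity_weight=0.5)
```

In this sketch the tasks differ only in how the input is corrupted and in whether the sparsity term is switched on; the target is always the clean image. The noise level `sigma`, the blur `radius`, and `sparsity_weight` play the role of the distortion and sparsity levels whose moderate values the abstract reports as optimal.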
Problem

Research questions and friction points this paper is trying to address.

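Can human-like visual perception emerge from natural image statistics alone, without perceptual supervision?
Do representations learned for image reconstruction tasks (autoencoding, denoising, deblurring, sparsity regularization) match human judgments of image distortion?
At what levels of noise, blur, and sparsity is that match strongest?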
Innovation

Methods, ideas, or system contributions that make the work stand out.

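PerceptNet, a bio-inspired retina-V1 architecture, is optimized end-to-end for image reconstruction tasks
The encoder (V1-like) stage correlates best with human perceptual judgments, with no perceptual supervision
Alignment with human judgments peaks at moderate noise, blur, and sparsity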
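
The alignment claim in the points above can be made concrete with a small sketch of how such a correlation might be computed. This is an assumed evaluation recipe, not the paper's protocol: the Euclidean distance in encoder-response space, the helper names, the identity encoder, and the toy scores are all hypothetical placeholders, and the scores are invented purely so the example runs.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def encoder_distance(encoder, reference, distorted):
    """Assumed perceptual metric: Euclidean distance between encoder responses."""
    a, b = encoder(reference), encoder(distorted)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def perceptual_alignment(encoder, pairs, human_scores):
    """Correlate model distances with human ratings of distortion visibility.
    Assumes higher scores mean more visible distortion."""
    distances = [encoder_distance(encoder, ref, dist) for ref, dist in pairs]
    return pearson(distances, human_scores)

# Hypothetical stand-in encoder: returns the (already flat) image unchanged.
def identity_encoder(img):
    return list(img)

# Toy placeholder data, invented for illustration only (not experimental results).
reference = [0.0, 0.0, 0.0, 0.0]
pairs = [(reference, [v] * 4) for v in (0.1, 0.2, 0.4)]
toy_scores = [1.0, 2.0, 4.0]

alignment = perceptual_alignment(identity_encoder, pairs, toy_scores)
```

Repeating this computation with the responses of each model stage in place of `identity_encoder` is one way to compare how well different layers track human judgments. If the human ratings measure quality rather than distortion visibility, the expected sign of the correlation flips.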