🤖 AI Summary
Current vision systems are insufficiently robust to noise, deformation, and out-of-distribution (OOD) data, and they lack neural representation mechanisms that are unsupervised, compositional, and interpretable. To address these limitations, we propose the Cooperative Network Architecture (CNA), which introduces a mechanism for dynamically composing "net fragments." CNA discovers these local structural primitives without supervision via statistical learning and recursively assembles them through structured sparse connectivity, yielding coupled global–local representations that adaptively encode sensory patterns. Without any labeled data, CNA achieves robust recognition under noise and geometric deformation, zero-shot schema completion, and generalization to unseen patterns, all of which improve OOD generalization. Its modular, interpretable neural representations offer a new paradigm for invariant object recognition, bridging compositional structure with biological plausibility and computational efficiency.
📝 Abstract
We introduce the Cooperative Network Architecture (CNA), a model that represents sensory signals using structured, recurrently connected networks of neurons, termed "nets." Nets are dynamically assembled from overlapping net fragments, which are learned based on statistical regularities in sensory input. This architecture offers robustness to noise, deformation, and out-of-distribution data, addressing challenges in current vision systems from a novel perspective. We demonstrate that net fragments can be learned without supervision and flexibly recombined to encode novel patterns, enabling figure completion and resilience to noise. Our findings establish CNA as a promising paradigm for developing neural representations that integrate local feature processing with global structure formation, providing a foundation for future research on invariant object recognition.
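The general idea of learning recurrent connections from input statistics and then using them to complete a corrupted pattern can be illustrated with a much simpler, classical model. The sketch below is a toy Hopfield-style network with Hebbian weights; it is not the CNA model, and the function names (`hebbian_weights`, `settle`) and the two stored patterns are hypothetical choices made only for this illustration.

```python
# Toy Hopfield-style network: NOT the CNA model, only an illustration of
# unsupervised Hebbian learning followed by recurrent pattern completion.


def hebbian_weights(patterns):
    """Build symmetric lateral weights from co-activation statistics."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j]
    return w


def settle(w, state, max_steps=10):
    """Synchronously update +1/-1 units until the state stops changing."""
    state = list(state)
    for _ in range(max_steps):
        new_state = []
        for i, row in enumerate(w):
            field = sum(w_ij * s_j for w_ij, s_j in zip(row, state))
            if field == 0:
                # Keep the current value when the input field is exactly zero.
                new_state.append(state[i])
            else:
                new_state.append(1 if field > 0 else -1)
        if new_state == state:
            break
        state = new_state
    return state


# Two hypothetical stored patterns (assumed here, chosen to be orthogonal).
PATTERN_A = [1, 1, 1, 1, -1, -1, -1, -1]
PATTERN_B = [1, -1, 1, -1, 1, -1, 1, -1]

weights = hebbian_weights([PATTERN_A, PATTERN_B])

# Corrupt one unit of PATTERN_A and let the recurrent dynamics restore it.
noisy = [-1] + PATTERN_A[1:]
restored = settle(weights, noisy)
print(restored == PATTERN_A)  # prints True
```

In this toy, every unit is connected to every other unit and whole patterns are stored. According to the abstract, CNA instead assembles nets from overlapping, locally learned fragments, so the sketch conveys only the flavor of completion through learned recurrent connectivity.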