🤖 AI Summary
This work addresses the limited robustness and interpretability of biologically inspired Hebbian representation learning. We propose a decoupled multi-branch encoder architecture that incorporates human-defined quasi-invariant filters (such as orientation, scale, and motion) as inductive biases in its early layers, jointly trained with local Hebbian plasticity rules and contrastive predictive coding. To our knowledge, this is the first systematic integration of multiple invariance priors with local plasticity mechanisms. Evaluated on image and video benchmarks, including GTSRB, STL-10, CODEBRIM, and UCF101, the approach significantly improves classification robustness and generalization. The resulting representations substantially narrow the performance gap between Hebbian learning and backpropagation, approaching supervised baselines. These results demonstrate the feasibility and promise of efficient, transparent representation learning grounded in local synaptic plasticity.
📝 Abstract
Modern data-driven machine learning systems exploit inductive biases in architectural structure, invariance and equivariance requirements, task-specific loss functions, and computational optimization tools. Previous work has shown that human-specified quasi-invariant filters can serve as a powerful inductive bias in the early layers of an encoder, enhancing the robustness and transparency of learned classifiers. This paper explores this idea further in the context of representation learning with bio-inspired Hebbian learning rules. We propose a modular framework trained with a bio-inspired variant of contrastive predictive coding, comprising parallel encoders that leverage different invariant visual descriptors as inductive biases. We evaluate the representation learning capacity of our system in classification scenarios using diverse image datasets (GTSRB, STL-10, CODEBRIM) and a video dataset (UCF101). Our findings indicate that this form of inductive bias significantly improves the robustness of learned representations and narrows the performance gap between models using local Hebbian plasticity rules and those using backpropagation, while also achieving superior performance compared to non-decomposed encoders.
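The decoupled multi-branch idea described above can be illustrated with a minimal sketch. Note this is an assumed toy construction, not the paper's implementation: each branch applies a fixed, hand-chosen "quasi-invariant" filter (here an identity map and a finite-difference operator, stand-ins for orientation/scale descriptors) and then learns a linear projection with a purely local Hebbian update (Oja's rule), with no gradients flowing between branches. The final representation is the concatenation of all branch outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

def oja_step(W, x, lr=0.01):
    """One local Hebbian update (Oja's rule): weights grow with correlated
    pre/post activity, and the decay term keeps them bounded."""
    y = W @ x                               # post-synaptic branch activity
    W += lr * np.outer(y, x - W.T @ y)      # dW = lr * y (x - W^T y)^T
    return W

# Fixed filter banks acting as inductive biases (assumed toy forms,
# not the descriptors used in the paper): identity and finite difference.
filters = [np.eye(8), np.diag(np.ones(7), 1) - np.eye(8)]

# One learnable Hebbian projection per branch (4 output features each).
branch_W = [rng.normal(scale=0.1, size=(4, 8)) for _ in filters]

# Unsupervised training loop: each branch updates with only local signals.
for _ in range(200):
    x = rng.normal(size=8)
    for F, W in zip(filters, branch_W):
        oja_step(W, F @ x)

# Representation = concatenation of all branch outputs (2 branches x 4 features).
x = rng.normal(size=8)
z = np.concatenate([W @ (F @ x) for F, W in zip(filters, branch_W)])
print(z.shape)  # (8,)
```

In the paper the branches are full convolutional encoders trained with a contrastive predictive coding objective rather than plain Oja's rule; the sketch only shows the structural point that each branch sees the input through a different fixed descriptor and adapts via a local rule.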