Vision Hopfield Memory Networks

📅 2026-03-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a brain-inspired visual foundation network that addresses the limitations of existing models—namely their lack of neurobiological plausibility, heavy reliance on large-scale data, and poor interpretability. By hierarchically integrating local and global Hopfield associative memory modules for the first time and incorporating a predictive coding–driven iterative refinement mechanism, the model unifies the representation of both local and global visual dynamics. This architecture substantially enhances data efficiency, interpretability, and biological fidelity while achieving performance on par with state-of-the-art backbone networks across major visual benchmarks.
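The summary does not pin down the exact update rule, but the local associative-memory step it describes can be read as a continuous (modern) Hopfield retrieval, in which a patch embedding is iteratively pulled toward stored patterns via a softmax over similarities. The sketch below is an assumption-laden illustration of that general idea, not the paper's code: `hopfield_retrieve`, the memory-bank shape, and the inverse temperature `beta` are all hypothetical names.

```python
import torch

def hopfield_retrieve(state, memory, beta=8.0, n_steps=1):
    """Continuous modern-Hopfield retrieval step: a softmax-weighted
    average over a bank of stored patterns, applied e.g. to patch embeddings.

    state:  (batch, d)     query patterns (e.g. image-patch embeddings)
    memory: (num_mem, d)   stored patterns acting as associative memory
    """
    for _ in range(n_steps):
        logits = beta * state @ memory.T       # similarity to each stored pattern
        weights = torch.softmax(logits, dim=-1)
        state = weights @ memory               # move the state toward stored patterns
    return state

# Toy usage: retrieve 4 patch embeddings against a 16-slot memory bank.
patches = torch.randn(4, 32)
bank = torch.randn(16, 32)
retrieved = hopfield_retrieve(patches, bank, n_steps=3)
```

With a large `beta` the softmax sharpens and retrieval snaps to a single stored pattern; with a small `beta` it blends several, which is one way a "local" memory module could smooth patch-level features.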

📝 Abstract
Recent vision and multimodal foundation backbones, such as Transformer families and state-space models like Mamba, have achieved remarkable progress, enabling unified modeling across images, text, and beyond. Despite their empirical success, these architectures remain far from the computational principles of the human brain, often demanding enormous amounts of training data while offering limited interpretability. In this work, we propose the Vision Hopfield Memory Network (V-HMN), a brain-inspired foundation backbone that integrates hierarchical memory mechanisms with iterative refinement updates. Specifically, V-HMN incorporates local Hopfield modules that provide associative memory dynamics at the image patch level, global Hopfield modules that function as episodic memory for contextual modulation, and a predictive-coding-inspired refinement rule for iterative error correction. By organizing these memory-based modules hierarchically, V-HMN captures both local and global dynamics in a unified framework. Memory retrieval exposes the relationship between inputs and stored patterns, making decisions more interpretable, while the reuse of stored patterns improves data efficiency. This brain-inspired design therefore enhances interpretability and data efficiency beyond existing self-attention- or state-space-based approaches. We conducted extensive experiments on public computer vision benchmarks, and V-HMN achieved competitive results against widely adopted backbone architectures, while offering better interpretability, higher data efficiency, and stronger biological plausibility. These findings highlight the potential of V-HMN to serve as a next-generation vision foundation model, while also providing a generalizable blueprint for multimodal backbones in domains such as text and audio, thereby bridging brain-inspired computation with large-scale machine learning.
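The abstract describes the refinement rule only at a high level. One common way to realize predictive-coding-style iterative error correction is to repeatedly nudge a latent representation down the gradient of its prediction error. The following sketch is a generic illustration under that assumption; `pc_refine`, the linear decoder, and the step size are hypothetical and are not claimed to be the paper's actual rule.

```python
import torch

def pc_refine(latent, target, decoder, n_iters=5, step=0.1):
    """Generic predictive-coding refinement: iteratively update a latent
    code so that its top-down prediction better matches the observed target."""
    latent = latent.detach().clone().requires_grad_(True)
    for _ in range(n_iters):
        pred = decoder(latent)                 # top-down prediction
        error = ((pred - target) ** 2).sum()   # prediction-error energy
        (grad,) = torch.autograd.grad(error, latent)
        latent = (latent - step * grad).detach().requires_grad_(True)
    return latent.detach()

# Toy usage: refine a 32-d latent so a linear decoder reconstructs a 64-d target.
decoder = torch.nn.Linear(32, 64)
latent0 = torch.randn(1, 32)
target = torch.randn(1, 64)
refined = pc_refine(latent0, target, decoder)
```

Each iteration corresponds to one error-correction pass; stacking such passes across a hierarchy of memory modules is the kind of unified local-plus-global scheme the abstract alludes to.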
Problem

Research questions and friction points this paper is trying to address.

vision foundation models
data efficiency
interpretability
brain-inspired computation
associative memory
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hopfield Memory
Brain-inspired Architecture
Hierarchical Memory
Iterative Refinement
Interpretability
👥 Authors

Jianfeng Wang
Department of Computer Science, University of Oxford, United Kingdom
Amine M'Charrak
Department of Computer Science, University of Oxford, United Kingdom
Luk Koska
Institute of Logic and Computation, Vienna University of Technology, Austria
Xiangtao Wang
Institute of Logic and Computation, Vienna University of Technology, Austria
Daniel Petriceanu
Institute of Logic and Computation, Vienna University of Technology, Austria
Mykyta Smyrnov
Institute of Logic and Computation, Vienna University of Technology, Austria
Ruizhi Wang
Institute of Logic and Computation, Vienna University of Technology, Austria
Michael Bumbar
Institute of Logic and Computation, Vienna University of Technology, Austria
Luca Pinchetti
DPhil Computer Science, University of Oxford
Deep learning
Thomas Lukasiewicz
Vienna University of Technology, Austria; University of Oxford, UK
Artificial Intelligence, Machine Learning, Information Systems