Hopfield-Fenchel-Young Networks: A Unified Framework for Associative Memory Retrieval

📅 2024-11-13
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
📄 PDF
🤖 AI Summary
This paper addresses the lack of a unified theoretical framework for classical and modern Hopfield networks. We propose a family of generalized energy functions based on the Fenchel–Young loss difference, casting associative memory as a convex optimization problem. Our method integrates convex analysis, SparseMAP transformations, and energy minimization modeling. Key contributions include: (1) the first derivation of a differentiable sparse update rule via Tsallis entropy; (2) a theoretical characterization linking loss margin, sparsity, and exact single-pattern retrieval; and (3) a unifying interpretation of the energetic roles of ℓ₂-normalization, layer normalization, and other post-processing operations. Evaluated across free recall, sequential recall, image retrieval, multi-instance learning, and text rationalization tasks, our approach achieves significant improvements in recall accuracy and robustness, empirically validating the efficacy of theory-driven design.

Technology Category

Application Category

📝 Abstract
Associative memory models, such as Hopfield networks and their modern variants, have garnered renewed interest due to advancements in memory capacity and connections with self-attention in transformers. In this work, we introduce a unified framework-Hopfield-Fenchel-Young networks-which generalizes these models to a broader family of energy functions. Our energies are formulated as the difference between two Fenchel-Young losses: one, parameterized by a generalized entropy, defines the Hopfield scoring mechanism, while the other applies a post-transformation to the Hopfield output. By utilizing Tsallis and norm entropies, we derive end-to-end differentiable update rules that enable sparse transformations, uncovering new connections between loss margins, sparsity, and exact retrieval of single memory patterns. We further extend this framework to structured Hopfield networks using the SparseMAP transformation, allowing the retrieval of pattern associations rather than a single pattern. Our framework unifies and extends traditional and modern Hopfield networks and provides an energy minimization perspective for widely used post-transformations like $ell_2$-normalization and layer normalization-all through suitable choices of Fenchel-Young losses and by using convex analysis as a building block. Finally, we validate our Hopfield-Fenchel-Young networks on diverse memory recall tasks, including free and sequential recall. Experiments on simulated data, image retrieval, multiple instance learning, and text rationalization demonstrate the effectiveness of our approach.
Problem

Research questions and friction points this paper is trying to address.

Unify associative memory models with broader energy functions
Enable sparse transformations via differentiable update rules
Extend framework to retrieve pattern associations, not single patterns
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unifies Hopfield networks via Fenchel-Young losses
Uses Tsallis and norm entropies for sparse transformations
Extends framework to structured pattern associations
🔎 Similar Papers
No similar papers found.
S
Saul Santos
Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal; Instituto de Telecomunicações, Lisbon, Portugal
Vlad Niculae
Vlad Niculae
University of Amsterdam
Structured PredictionNatural Language ProcessingMachine Learning
D
Daniel C. McNamee
Champalimaud Research, Lisbon, Portugal
A
André F. T. Martins
Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal; Instituto de Telecomunicações, Lisbon, Portugal; ELLIS Unit Lisbon; Unbabel, Lisbon, Portugal