Hyperparameter Transfer for Dense Associative Memories

📅 2026-05-11

🤖 AI Summary
Existing hyperparameter transfer methods were developed for feed-forward networks and have not been extended to Dense Associative Memory (DenseAM) models, which share weights both across and within layers and use rapidly peaking activation functions. This work initiates the development of hyperparameter transfer methods for this class of models, deriving explicit prescriptions for carrying hyperparameters tuned on small models over to models trained at scale. The derivation accounts for both weight sharing and the non-standard activations, and the theoretical predictions agree closely with the empirical results.
📝 Abstract
Dense Associative Memory (DenseAM) is a promising family of AI architectures that is represented by a neural network performing temporal dynamics on an energy landscape. While hyperparameter transfer methods are well-studied for feed-forward networks, these methods have not been developed for settings in which weights are shared both across layers and within a layer, as is common in DenseAMs. Additionally, DenseAMs utilize rapidly peaking activation functions that are rarely used in feed-forward architectures. The confluence of these aspects makes DenseAM a challenging framework for using existing methods for hyperparameter transfer. Our work initiates the development of hyperparameter transfer methods for this class of models. We derive explicit prescriptions for how the hyperparameters tuned on small models can be transferred to models trained at scale. We demonstrate excellent agreement between these theoretical findings and empirical results.
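To make "temporal dynamics on an energy landscape" concrete, here is a minimal sketch of one well-known DenseAM variant from the broader literature: a log-sum-exp energy with a softmax retrieval update. It is an illustration only and not necessarily the model or parameterization studied in this paper. The names `energy`, `update`, and `retrieve`, the inverse temperature `beta`, and the toy patterns are all assumptions introduced for the example. The same `memories` matrix is reused at every step and used twice within a step (once to score, once to read out), which is one way to picture weight sharing across and within layers; the exponential in the softmax plays the role of a rapidly peaking activation.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(a_i * b_i for a_i, b_i in zip(a, b))


def softmax(scores, beta):
    """Numerically stable softmax of beta * scores (sharply peaked for large beta)."""
    m = max(scores)
    exps = [math.exp(beta * (s - m)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def energy(x, memories, beta):
    """Illustrative energy: -(1/beta) * log(sum_mu exp(beta * <xi_mu, x>)) + 0.5 * ||x||^2."""
    scores = [dot(mem, x) for mem in memories]
    m = max(scores)
    lse = m + math.log(sum(math.exp(beta * (s - m)) for s in scores)) / beta
    return -lse + 0.5 * dot(x, x)


def update(x, memories, beta):
    """One retrieval step: score against the memories, then read out with the same memories."""
    weights = softmax([dot(mem, x) for mem in memories], beta)
    return [sum(w * mem[i] for w, mem in zip(weights, memories)) for i in range(len(x))]


def retrieve(x, memories, beta, steps=5):
    """Unroll the dynamics; every step ("layer") shares the same memories."""
    for _ in range(steps):
        x = update(x, memories, beta)
    return x


# Toy usage (hypothetical data): three stored patterns and a query with one flipped entry.
memories = [[1.0, 1.0, -1.0, -1.0], [-1.0, 1.0, 1.0, -1.0], [1.0, -1.0, 1.0, -1.0]]
query = [1.0, 1.0, -1.0, 1.0]
recalled = retrieve(query, memories, beta=4.0, steps=3)
print(recalled, energy(query, memories, 4.0), energy(recalled, memories, 4.0))
```

In this sketch the number of stored patterns and the state dimension are the quantities one would vary when moving from a small model to a large one; how hyperparameters such as the learning rate or `beta` should change with them is exactly what the paper's prescriptions address, and is not reproduced here.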
Problem

Research questions and friction points this paper is trying to address.

Hyperparameter Transfer
Dense Associative Memory
Weight Sharing
Peaking Activation Functions
Energy Landscape
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hyperparameter Transfer
Dense Associative Memory
Weight Sharing
Peaking Activation Functions
Energy Landscape