🤖 AI Summary
This work addresses the challenge of explicitly controlling memorization behavior in language models during training, a capability lacking in existing approaches that only support post-hoc detection. The authors propose Memory Dial, a novel framework that introduces memory pressure as a tunable hyperparameter α into the training objective by interpolating between standard cross-entropy and a temperature-sharpened target, thereby enabling continuous control over memorization strength. Evaluated across six architectures and five benchmarks, the method demonstrates that increasing α monotonically improves accuracy on seen samples while maintaining stable performance on unseen data. Experiments further reveal that larger models exhibit heightened sensitivity to memory pressure and that high-frequency sequences are more readily memorized, establishing a clear relationship among model scale, sequence frequency, and memorization capacity.
📝 Abstract
Memorization in language models is widely studied but remains difficult to isolate and control. Understanding when and what models memorize is essential for explaining their predictions, yet existing approaches are post-hoc: they can detect memorization in trained models, but cannot disentangle its effects from architecture, data, or optimization. We introduce Memory Dial, a training framework that makes memorization pressure an explicit, controllable variable. Memory Dial interpolates between standard cross-entropy and a temperature-sharpened objective via a single parameter $\alpha$, producing a family of models identical in architecture and training setup (within each sweep), differing only in memorization pressure. Experiments across six architectures and five benchmarks demonstrate that: (1) $\alpha$ reliably controls memorization pressure, with seen-example accuracy increasing monotonically while unseen accuracy remains stable; (2) larger models are more responsive to memorization pressure; and (3) frequent sequences are easier to memorize than rare ones. Additional analyses show that the effect is robust across a range of sharpening temperatures, differs qualitatively from single-temperature cross-entropy, transfers to multilingual settings, and is detectable even on naturally occurring single-occurrence sequences. Memory Dial provides a controlled experimental framework for studying how memorization behavior emerges and interacts with generalization in language models.
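The abstract describes the objective only at a high level, so the exact formula is not given here. A minimal sketch of one plausible form, assuming $\alpha$ linearly interpolates between plain cross-entropy and the same cross-entropy computed on temperature-sharpened logits (the function name, the parameter `tau`, and the specific interpolation are illustrative assumptions, not the paper's confirmed definition):

```python
import torch
import torch.nn.functional as F

def memory_dial_loss(logits: torch.Tensor,
                     targets: torch.Tensor,
                     alpha: float = 0.5,
                     tau: float = 0.5) -> torch.Tensor:
    """Hypothetical Memory Dial objective:
       (1 - alpha) * CE(logits, y) + alpha * CE(logits / tau, y).

    alpha: memorization-pressure dial in [0, 1]; alpha = 0 recovers
           standard cross-entropy training.
    tau:   sharpening temperature; tau < 1 sharpens the predicted
           distribution, pushing probability mass onto the argmax token.
    """
    ce = F.cross_entropy(logits, targets)          # standard objective
    ce_sharp = F.cross_entropy(logits / tau, targets)  # sharpened objective
    return (1.0 - alpha) * ce + alpha * ce_sharp
```

Under this sketch, a sweep over $\alpha$ with everything else fixed yields the family of models the abstract describes, and the $\alpha = 0$ endpoint serves as the standard-training baseline.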