Problem
Research questions and friction points this paper addresses.
Memory Mosaics achieve predictive disentanglement in a transparent way.
Memory Mosaics perform comparably to or better than transformers in language modeling.
Memory Mosaics demonstrate compositional and in-context learning capabilities.
Innovation
Methods, ideas, or system contributions that make the work stand out.
Networks of associative memories for prediction tasks
Compositional and in-context learning capabilities
Predictive disentanglement achieved transparently, in contrast to the opacity of transformers
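The core building block, an associative memory, can be sketched as kernel-based key-value retrieval: stored values are averaged with weights given by a kernel over key similarity. This is a minimal illustrative sketch, not the paper's exact parameterization; the function name, the Gaussian kernel choice, and the `beta` parameter are assumptions for illustration.

```python
import numpy as np

def associative_recall(keys, values, query, beta=1.0):
    """Return a kernel-weighted average of stored values.

    keys:   (n, d) array of stored keys
    values: (n, m) array of stored values
    query:  (d,) query key
    beta:   kernel sharpness (illustrative hyperparameter)
    """
    sq_dists = np.sum((keys - query) ** 2, axis=1)  # ||k_i - q||^2 for each stored key
    weights = np.exp(-beta * sq_dists)              # Gaussian kernel scores
    weights /= weights.sum()                        # normalize to a probability vector
    return weights @ values                         # weighted average of stored values

# Usage: store three key-value pairs, then query near the second key;
# the recalled value is dominated by that pair's stored value.
keys = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
values = np.array([[1.0], [2.0], [3.0]])
out = associative_recall(keys, values, np.array([0.9, 0.0]), beta=5.0)
```

With a sharp kernel (`beta=5.0`), the query `[0.9, 0.0]` recalls a value close to `2.0`, the value stored under the nearest key. A network of such units, each specializing in different key/value features, is what allows the composed predictors to disentangle.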