🤖 AI Summary
This paper addresses the problem of model-agnostic generalization to unseen categories in image classification without retraining the base model. We propose a Memory-Modular Image Classifier that decouples knowledge storage (an external multimodal vision-language memory) from inference, enabling zero-shot, few-shot, fine-grained, and class-incremental classification solely through external memory replacement. Our method leverages joint vision-language representations, dynamic memory retrieval, and meta-learned noise-augmented data generation to construct a cacheable, swappable external knowledge module. To our knowledge, this is the first work to establish a paradigm in which cross-category generalization is driven purely by updating memory contents. Extensive experiments demonstrate significant performance gains over fine-tuning and prompt-engineering baselines across diverse classification tasks, while requiring zero parameter updates, ensuring strong adaptability and robustness across domains and deployment scenarios.
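The summary's central claim, that adaptation happens solely through memory replacement while the model's weights stay frozen, can be illustrated with a minimal sketch. The class below is a hypothetical stand-in, not the paper's implementation: `memory_feats` and `memory_labels` represent precomputed embeddings and labels of web-crawled memory entries, and classification is a nearest-neighbor lookup against them.

```python
import numpy as np

class MemoryModularClassifier:
    """Minimal sketch: classification driven entirely by swappable memory.

    The encoder is assumed frozen elsewhere; here we work directly with
    embeddings. Replacing the memory is the only form of "update".
    """

    def __init__(self, memory_feats, memory_labels):
        self.replace_memory(memory_feats, memory_labels)

    def replace_memory(self, memory_feats, memory_labels):
        # Swap in new memory contents; no model parameters change.
        norms = np.linalg.norm(memory_feats, axis=1, keepdims=True)
        self.feats = memory_feats / np.clip(norms, 1e-8, None)
        self.labels = np.asarray(memory_labels)

    def predict(self, query_feat):
        # Cosine similarity to every memory entry; return the best match.
        q = query_feat / max(np.linalg.norm(query_feat), 1e-8)
        return self.labels[int(np.argmax(self.feats @ q))]
```

Under this framing, moving to a new set of categories amounts to one `replace_memory` call with embeddings of the new classes' web data.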
📝 Abstract
We propose a novel memory-modular learner for image classification that separates knowledge memorization from reasoning. Our model enables effective generalization to new classes by simply replacing the memory contents, without the need for model retraining. Unlike traditional models that encode both world knowledge and task-specific skills into their weights during training, our model stores knowledge in an external memory of web-crawled image and text data. At inference time, the model dynamically selects relevant content from the memory based on the input image, allowing it to adapt to arbitrary classes simply by swapping in new memory contents. The key differentiator is that our learner meta-learns to perform classification with noisy web data from unseen classes, resulting in robust performance across various classification scenarios. Experimental results demonstrate the promising performance and versatility of our approach across diverse classification tasks, including zero-shot/few-shot classification of unseen classes, fine-grained classification, and class-incremental classification.
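The inference-time behavior the abstract describes, dynamically selecting relevant memory content for an input image and tolerating label noise in web data, can be sketched as similarity-weighted retrieval. All names below (`retrieve_and_classify`, `tau`, `k`) are illustrative assumptions, not the paper's API: the idea is to keep the top-k most similar memory entries and softmax-aggregate their possibly noisy labels rather than trust any single neighbor.

```python
import numpy as np

def retrieve_and_classify(query, mem_feats, mem_labels, n_classes, k=3, tau=0.07):
    """Hypothetical sketch of dynamic memory retrieval at inference time.

    Scores memory entries by cosine similarity to the query embedding,
    keeps the top-k, and aggregates their labels with softmax weights,
    which softens the impact of individual noisy web-crawled labels.
    """
    q = query / max(np.linalg.norm(query), 1e-8)
    norms = np.linalg.norm(mem_feats, axis=1, keepdims=True)
    m = mem_feats / np.clip(norms, 1e-8, None)
    sims = m @ q
    topk = np.argsort(sims)[-k:]                    # k most relevant entries
    w = np.exp(sims[topk] / tau)
    w /= w.sum()                                    # softmax over similarities
    probs = np.zeros(n_classes)
    for weight, label in zip(w, mem_labels[topk]):  # weighted label vote
        probs[label] += weight
    return probs
```

A temperature like `tau=0.07` (an assumed value) makes the vote sharply favor the closest entries while still letting multiple neighbors contribute.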