MOMEMTO: Patch-based Memory Gate Model in Time Series Foundation Model

📅 2025-09-23
📈 Citations: 0
Influential citations: 0
📄 PDF
🤖 AI Summary
Reconstruction-based deep models for time-series anomaly detection often over-generalize, reconstructing even unseen anomalies accurately and losing the ability to discriminate them from normal patterns. To address this, the paper proposes MOMEMTO, a time series foundation model (TFM) for anomaly detection featuring a patch-based memory gate. Its core component is a multi-domain patch memory module: memory items are initialized with latent representations from a pre-trained encoder, organized into patch-level units, and updated via an attention mechanism, allowing a single model to be jointly fine-tuned across multiple datasets. This design mitigates over-generalization, avoids the high training costs of earlier memory-based prototype approaches, and integrates an explicit memory mechanism into a TFM, something prior memory architectures had not achieved effectively. Experiments on 23 univariate benchmark datasets show that MOMEMTO, as a single model, outperforms baseline methods on AUC and VUS metrics and further improves its backbone TFM, with the largest gains in few-shot settings.

📝 Abstract
Recently, reconstruction-based deep models have been widely used for time series anomaly detection, but as their capacity and representation capability increase, these models tend to over-generalize, often reconstructing unseen anomalies accurately. Prior works have attempted to mitigate this by incorporating a memory architecture that stores prototypes of normal patterns. Nevertheless, these approaches suffer from high training costs and have yet to be effectively integrated with time series foundation models (TFMs). To address these challenges, we propose MOMEMTO, a TFM for anomaly detection enhanced with a patch-based memory module to mitigate over-generalization. The memory module is designed to capture representative normal patterns from multiple domains and enables a single model to be jointly fine-tuned across multiple datasets through a multi-domain training strategy. MOMEMTO initializes memory items with latent representations from a pre-trained encoder, organizes them into patch-level units, and updates them via an attention mechanism. We evaluate our method on 23 univariate benchmark datasets. Experimental results demonstrate that MOMEMTO, as a single model, achieves higher scores on AUC and VUS metrics than baseline methods, and further enhances the performance of its backbone TFM, particularly in few-shot learning scenarios.
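
The abstract's description of the memory read — patch embeddings attended against stored prototypes — can be illustrated with a minimal PyTorch sketch. Everything below is an assumption for illustration: the class name `PatchMemory`, the plain dot-product attention, and the random initialization are stand-ins, not the paper's actual implementation (which also involves a gating mechanism and a memory update rule not shown here).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchMemory(nn.Module):
    """Reconstructs each patch embedding as an attention-weighted
    mixture of stored prototype ("memory") items."""

    def __init__(self, num_items: int, d_model: int):
        super().__init__()
        # Learnable memory items. The paper initializes them with latent
        # representations from a pre-trained encoder; random init here.
        self.items = nn.Parameter(torch.randn(num_items, d_model))

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        # patches: (batch, num_patches, d_model) from the backbone encoder.
        scale = self.items.shape[-1] ** 0.5
        # Similarity of every patch to every memory item.
        attn = F.softmax(patches @ self.items.t() / scale, dim=-1)
        # Re-express each patch through normal prototypes only; anomalous
        # patches that match no prototype reconstruct poorly.
        return attn @ self.items  # (batch, num_patches, d_model)
```

At detection time, the per-patch reconstruction error between the encoder output and this memory-based reconstruction is a natural anomaly score: a patch that cannot be expressed as a mixture of normal prototypes reconstructs poorly.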
Problem

Research questions and friction points this paper is trying to address.

Addresses over-generalization in time series anomaly detection models
Reduces high training costs of memory-based prototype approaches
Enables effective integration with time series foundation models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Patch-based memory module for over-generalization mitigation
Multi-domain training strategy for joint fine-tuning
Attention mechanism updating patch-level memory items
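
A hedged sketch of how the first and third points could fit around the `PatchMemory` module sketched above. `init_memory_from_encoder`, the random-subsample seeding, and `patch_anomaly_score` are hypothetical stand-ins; the abstract confirms only that memory items are initialized from pre-trained encoder latents, organized into patch-level units, and updated via attention.

```python
import torch

@torch.no_grad()
def init_memory_from_encoder(memory: "PatchMemory", latents: torch.Tensor):
    """Seed memory items with pre-trained encoder latents of normal patches.

    latents: (N, d_model) patch-level representations from the frozen,
    pre-trained TFM encoder; a random subsample stands in for a more
    careful (e.g., clustering-based) selection of prototypes.
    """
    idx = torch.randperm(latents.shape[0])[: memory.items.shape[0]]
    memory.items.copy_(latents[idx])

def patch_anomaly_score(patches: torch.Tensor, recon: torch.Tensor) -> torch.Tensor:
    """Per-patch reconstruction error, shape (batch, num_patches).
    High error means the patch is poorly explained by normal prototypes."""
    return ((patches - recon) ** 2).mean(dim=-1)
```

Whether the attention-based update is an explicit write to the memory items or simply gradient flow through the attention read during joint multi-domain fine-tuning is not recoverable from the abstract, so the sketch leaves the items as ordinary learnable parameters.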
Samuel Yoon
Pohang University of Science and Technology, Republic of Korea
Jongwon Kim
Pohang University of Science and Technology, Republic of Korea
Juyoung Ha
Pohang University of Science and Technology, Republic of Korea
Young Myoung Ko
Professor of Industrial and Management Engineering, POSTECH
Stochastic Systems, Optimization, Applied Probability, Telecommunication Networks