TMUAD: Enhancing Logical Capabilities in Unified Anomaly Detection Models with a Text Memory Bank

📅 2025-08-29
🤖 AI Summary
Existing unified anomaly detection methods rely on image feature extraction and memory banks, yet struggle to identify logical anomalies when normal samples are scarce. To address this, we propose TMUAD, a framework that introduces a *text memory bank* to explicitly model semantic and logical relationships among objects. TMUAD builds three complementary memory banks holding class-level text features, object-level image features, and patch-level image features, enabling joint detection of structural and logical anomalies. The method combines a logic-aware text extractor, an image segmentation module, visual encoders, and retrieval of the most similar normal samples from each bank, followed by multi-level anomaly score fusion. Evaluated on seven public industrial and medical datasets, TMUAD achieves state-of-the-art performance. The model and code are publicly released.

📝 Abstract
Anomaly detection, which aims to identify anomalies deviating from normal patterns, is challenging due to the limited amount of normal data available. Unlike most existing unified methods that rely on carefully designed image feature extractors and memory banks to capture logical relationships between objects, we introduce a text memory bank to enhance the detection of logical anomalies. Specifically, we propose a Three-Memory framework for Unified structural and logical Anomaly Detection (TMUAD). First, we build a class-level text memory bank for logical anomaly detection by the proposed logic-aware text extractor, which can capture rich logical descriptions of objects from input images. Second, we construct an object-level image memory bank that preserves complete object contours by extracting features from segmented objects. Third, we employ visual encoders to extract patch-level image features for constructing a patch-level memory bank for structural anomaly detection. These three complementary memory banks are used to retrieve and compare normal images that are most similar to the query image, compute anomaly scores at multiple levels, and fuse them into a final anomaly score. By unifying structural and logical anomaly detection through collaborative memory banks, TMUAD achieves state-of-the-art performance across seven publicly available datasets involving industrial and medical domains. The model and code are available at https://github.com/SIA-IDE/TMUAD.
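The retrieve, compare, and fuse step described in the abstract can be illustrated with a minimal sketch. It is not the released TMUAD implementation: the cosine-distance scoring, the max-pooling over objects and patches, the equal fusion weights, and the names `nearest_distance` and `anomaly_score` are all assumptions made for illustration, and the features are assumed to be precomputed vectors.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def nearest_distance(queries, bank):
    # Cosine distance from each query row to its closest entry in the bank.
    q = l2_normalize(np.atleast_2d(queries))
    b = l2_normalize(np.atleast_2d(bank))
    sims = q @ b.T                      # shape: (num_queries, bank_size)
    return 1.0 - sims.max(axis=1)

def anomaly_score(text_feat, object_feats, patch_feats, banks,
                  weights=(1 / 3, 1 / 3, 1 / 3)):
    # One score per level; the worst (largest) distance represents the level.
    s_text = nearest_distance(text_feat, banks["text"]).max()
    s_obj = nearest_distance(object_feats, banks["object"]).max()
    s_patch = nearest_distance(patch_feats, banks["patch"]).max()
    # Hypothetical fusion rule: a fixed weighted sum of the three levels.
    w_t, w_o, w_p = weights
    return w_t * s_text + w_o * s_obj + w_p * s_patch

# Toy banks of "normal" features (assumed 4-dimensional for readability).
basis = np.eye(4)
banks = {"text": basis[:2], "object": basis[:2], "patch": basis[:2]}

# A query whose text feature matches nothing in the text bank, while its
# object and patch features look normal: only the text level raises the score.
score = anomaly_score(basis[2], basis[:1], basis[:2], banks)
print(float(score))
```

In this toy setup only the text-level term is nonzero, which mirrors the motivation in the abstract: a logical anomaly can leave patch-level appearance unchanged and still be caught by the text memory bank.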
Problem

Research questions and friction points this paper is trying to address.

Detecting logical anomalies with limited normal data
Unifying structural and logical anomaly detection methods
Enhancing object relationship understanding through text memory
Innovation

Methods, ideas, or system contributions that make the work stand out.

Text memory bank for logical anomaly detection
Three-memory framework unifying structural and logical anomaly detection
Multi-level anomaly score fusion from complementary banks
Jiawei Liu
Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China; Liaoning Liaohe Laboratory, Shenyang 110016, China; Key Laboratory on Intelligent Detection and Equipment Technology, Shenyang 110169, China
Jiahe Hou
Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China; Liaoning Liaohe Laboratory, Shenyang 110016, China; Key Laboratory on Intelligent Detection and Equipment Technology, Shenyang 110169, China; University of Chinese Academy of Sciences, Beijing, 100049, China
Wei Wang
Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China; Liaoning Liaohe Laboratory, Shenyang 110016, China; Key Laboratory on Intelligent Detection and Equipment Technology, Shenyang 110169, China
Jinsong Du
Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China; Liaoning Liaohe Laboratory, Shenyang 110016, China; Key Laboratory on Intelligent Detection and Equipment Technology, Shenyang 110169, China
Yang Cong
State Key Laboratory of Robotics, SIA, Chinese Academy of Sciences (CAS)
computer vision, machine learning, multimedia, robotics
Huijie Fan
Shenyang Institute of Automation, Chinese Academy of Sciences