IAD-R1: Reinforcing Consistent Reasoning in Industrial Anomaly Detection

📅 2025-08-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Conventional industrial anomaly detection (IAD) methods generalize poorly because defective samples are scarce, and existing vision-language models (VLMs) still perform poorly on IAD. This paper proposes IAD-R1, a general post-training framework for VLMs of different architectures and parameter scales. IAD-R1 trains in two stages: (1) Perception Activation Supervised Fine-Tuning (PA-SFT), which uses Expert-AD, a curated industrial anomaly chain-of-thought (CoT) dataset, to strengthen anomaly perception and tie reasoning to answers; and (2) Structured Control Group Relative Policy Optimization (SC-GRPO), which uses designed reward functions to move the model from anomaly perception to anomaly interpretation. Across 7 VLMs and 6 IAD benchmarks, IAD-R1 improves average accuracy by up to 43.3%, and the 0.5B-parameter model trained with it surpasses GPT-4.1 and Claude-Sonnet-4 in zero-shot settings.

📝 Abstract

Industrial anomaly detection is a critical component of modern manufacturing, yet the scarcity of defective samples restricts traditional detection methods to scenario-specific applications. Although Vision-Language Models (VLMs) demonstrate significant advantages in generalization capabilities, their performance in industrial anomaly detection remains limited. To address this challenge, we propose IAD-R1, a universal post-training framework applicable to VLMs of different architectures and parameter scales, which substantially enhances their anomaly detection capabilities. IAD-R1 employs a two-stage training strategy: the Perception Activation Supervised Fine-Tuning (PA-SFT) stage utilizes a meticulously constructed high-quality Chain-of-Thought dataset (Expert-AD) for training, enhancing anomaly perception capabilities and establishing reasoning-to-answer correlations; the Structured Control Group Relative Policy Optimization (SC-GRPO) stage employs carefully designed reward functions to achieve a capability leap from "Anomaly Perception" to "Anomaly Interpretation". Experimental results demonstrate that IAD-R1 achieves significant improvements across 7 VLMs, attaining up to 43.3% enhancement in average accuracy on 6 industrial anomaly detection benchmark datasets. Notably, the 0.5B parameter model trained with IAD-R1 surpasses commercial models including GPT-4.1 and Claude-Sonnet-4 in zero-shot settings, demonstrating the effectiveness and superiority of IAD-R1. The dataset, code, and all model weights will be publicly available at https://github.com/Yanhui-Lee/IAD-R1.
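The abstract does not spell out how SC-GRPO scores responses, so the following is only a minimal sketch of the generic group-relative idea behind GRPO-style training, not the authors' method. The function names, the `<think>` tag convention, and the reward weights (1.0 for a correct answer, 0.5 for a reasoning block) are all hypothetical, and the structured control group itself is not modeled.

```python
from statistics import mean, pstdev


def toy_reward(response: str, label: str) -> float:
    """Hypothetical reward: 1.0 for a correct final answer, plus 0.5
    if the response contains a <think>...</think> reasoning block.
    The tags and weights are illustrative assumptions, not from the paper."""
    has_reasoning = "<think>" in response and "</think>" in response
    # Treat whatever follows the reasoning block as the final answer.
    answer = response.split("</think>")[-1].strip().lower()
    correct = answer == label.lower()
    return 1.0 * correct + 0.5 * has_reasoning


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and standard deviation of
    its own sampled group. Population std is used here; implementations
    differ on this detail."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# A group of sampled responses to one image/question pair (made-up data).
group = [
    "<think>scratch on the surface</think> anomaly",
    "normal",
    "<think>texture looks uniform</think> normal",
    "anomaly",
]
rewards = [toy_reward(r, "anomaly") for r in group]
advantages = group_relative_advantages(rewards)
```

Responses that score above their group's mean receive a positive advantage and those below receive a negative one, so the policy update depends on relative quality within the group rather than on a separately learned value model.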

Problem

Research questions and friction points this paper is trying to address.

Enhancing anomaly detection in manufacturing with limited defective samples

Improving Vision-Language Models' performance in industrial anomaly detection

Developing a universal post-training framework for diverse VLM architectures

Innovation

Methods, ideas, or system contributions that make the work stand out.

Universal post-training framework for VLMs

Two-stage training with PA-SFT and SC-GRPO

Enhances anomaly perception and interpretation

🔎 Similar Papers

Customizing Visual-Language Foundation Models for Multi-modal Anomaly Detection and Reasoning