Hypergraph-Enhanced Training-Free and Language-Free Few-Shot Anomaly Detection

📅 2026-05-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses key limitations in few-shot anomaly detection (reliance on task-specific training, language supervision, handcrafted prompts, and poor cross-domain robustness) by proposing HyperFSAD, a purely visual, training-free, and prompt-free approach. Built upon DINOv3 features, the method introduces Sparse Hyper Matching and a Dual-Branch Image Scoring strategy, which combine hypergraph reasoning over support patches, support-aware CLS matching, and fusion of spatial and global evidence for cross-domain inference. Evaluated on four industrial and two medical datasets, the method reports state-of-the-art performance under the training-free and language-free setting.

📝 Abstract

Few-shot anomaly detection (FSAD) has made significant strides, yet existing methods still face critical challenges: (i) dependence on task- or dataset-specific training/fine-tuning, (ii) reliance on language supervision or carefully hand-crafted prompts, and (iii) limited robustness across domains. In this paper, we introduce HyperFSAD, a novel FSAD framework that is training-free, language-free, and robust across domains, offering a powerful solution to these challenges. Built upon DINOv3 and a hypergraph-based inference mechanism, our approach performs inference without any task-specific optimization or text prompts, while remaining competitive. Specifically, we replace sensitive nearest-neighbor / top-n matching with Sparse Hyper Matching: sparsemax first selects the most relevant support patches, which are then aggregated into a hyperedge as compact normal evidence to suppress background noise and distractors. We further introduce Dual-Branch Image Scoring, which fuses spatial anomaly evidence from the patch-grid anomaly map with global semantic deviation captured by support-aware CLS matching, yielding a robust image-level anomaly score in a strictly visual manner. Notably, all components of HyperFSAD are purely visual, eliminating the need for labor-intensive hand-crafted text prompts. Under the stringent training-free and language-free setting, HyperFSAD achieves state-of-the-art performance across six datasets spanning four industrial datasets (MVTecAD, VisA, MPDD, BTAD) and two medical datasets (RESC, BraTS).
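
The abstract does not give formulas, so the following is only a minimal NumPy sketch of how the two ideas could fit together, not the authors' implementation. It assumes patch and CLS features have already been extracted (for example from a DINOv3 backbone) and uses the standard sparsemax projection. The function names, the `temperature` and `alpha` parameters, the cosine-distance patch score, the max-pooled spatial branch, and the nearest-support CLS deviation are all illustrative assumptions.

```python
import numpy as np

def sparsemax(z):
    """Standard sparsemax: Euclidean projection of scores onto the simplex.
    Unlike softmax, it assigns exactly zero weight to low-scoring entries."""
    z = np.asarray(z, dtype=np.float64)
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    ks = np.arange(1, z.size + 1)
    support = 1.0 + ks * z_sorted > cumsum   # true for k = 1..k(z)
    k = ks[support][-1]
    tau = (cumsum[k - 1] - 1.0) / k
    return np.maximum(z - tau, 0.0)

def l2_normalize(x, eps=1e-12):
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def sparse_hyper_match(query, support, temperature=0.1):
    """Per-patch anomaly scores for query patches (Nq, D) given support
    patches (Ns, D). ASSUMPTION: the hyperedge is the sparsemax-weighted
    mean of support patches, and the score is the cosine distance to it."""
    q = l2_normalize(np.asarray(query, dtype=np.float64))
    s = l2_normalize(np.asarray(support, dtype=np.float64))
    sims = q @ s.T                           # (Nq, Ns) cosine similarities
    scores = np.empty(len(q))
    for i, row in enumerate(sims):
        weights = sparsemax(row / temperature)   # selects a few support patches
        hyperedge = l2_normalize(weights @ s)    # compact "normal" evidence
        scores[i] = 1.0 - float(q[i] @ hyperedge)
    return scores

def image_score(patch_scores, query_cls, support_cls, alpha=0.5):
    """ASSUMPTION: fuse the spatial branch (max patch score) with the global
    branch (cosine distance of the query CLS token to its closest support
    CLS token) by a convex combination with hypothetical weight alpha."""
    spatial = float(np.max(patch_scores))
    cls_sims = l2_normalize(np.asarray(support_cls, dtype=np.float64)) @ \
        l2_normalize(np.asarray(query_cls, dtype=np.float64))
    global_dev = 1.0 - float(np.max(cls_sims))
    return alpha * spatial + (1.0 - alpha) * global_dev

# Toy usage: one query patch matches the support set, one is orthogonal to it.
support_patches = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
query_patches = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
patch_scores = sparse_hyper_match(query_patches, support_patches)
score = image_score(patch_scores, [1.0, 0.0, 0.0], [[1.0, 0.0, 0.0]])
```

In this toy case the first query patch receives a score near 0 and the second a score near 1. A lower `temperature` makes sparsemax select fewer support patches, moving the behaviour closer to nearest-neighbor matching, while a higher value spreads weight over more patches.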

Problem

Research questions and friction points this paper is trying to address.

few-shot anomaly detection

training-free

language-free

domain robustness

anomaly detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hypergraph

Training-Free

Language-Free