🤖 AI Summary
This paper introduces a scene-adaptive anomaly detection paradigm that identifies “out-of-place” objects within each scene by modeling geometric and semantic consistency among typical object groups to localize scene-specific anomalies. Methodologically, it proposes the first object-level multi-view 3D reconstruction framework, generating geometrically consistent, part-aware instance representations, and integrates cross-instance contrastive learning with a neighborhood-referenced dynamic anomaly criterion. Key contributions include: (1) a formal definition of neighborhood-referenced singularity detection as a novel task; (2) the release of two new benchmarks—ToysAD-8K (toy-level) and PartsAD-15K (part-level)—the first of their kind; and (3) state-of-the-art performance on both benchmarks, with superior robustness to occlusion and strong interpretability. The approach consistently outperforms existing methods across diverse evaluation metrics, demonstrating its effectiveness in capturing fine-grained structural and semantic deviations within complex scenes.
📝 Abstract
This paper introduces a novel anomaly detection (AD) problem aimed at identifying `odd-looking' objects within a scene by comparing them to other objects present. Unlike traditional AD benchmarks with fixed anomaly criteria, our task detects anomalies specific to each scene by inferring a reference group of regular objects. To address occlusions, we use multiple views of each scene as input, construct 3D object-centric models for each instance from 2D views, enhancing these models with geometrically consistent part-aware representations. Anomalous objects are then detected through cross-instance comparison. We also introduce two new benchmarks, ToysAD-8K and PartsAD-15K as testbeds for future research in this task. We provide a comprehensive analysis of our method quantitatively and qualitatively on these benchmarks.