PiCo: Active Manifold Canonicalization for Robust Robotic Visual Anomaly Detection

📅 2026-03-24
🤖 AI Summary
This work addresses robotic visual anomaly detection under difficult operating conditions, such as varying 6-DoF poses, illumination changes, and shadows, in which semantic anomalies and physical disturbances are tightly coupled. The authors propose an active manifold canonicalization paradigm, PiCo, that maps observations onto a condition-invariant canonical manifold through a cascaded mechanism. First, the robot reorients the object to reduce geometric uncertainty; then a three-stage denoising hierarchy suppresses interference at the input, feature, and semantic levels. The study frames this as a shift from passive feature learning to joint physical and neural canonicalization. On the M2AD benchmark, PiCo reports a state-of-the-art 93.7% O-AUROC in static settings and 98.5% accuracy in active closed-loop settings.

📝 Abstract
Industrial deployment of robotic visual anomaly detection (VAD) is fundamentally constrained by passive perception under diverse 6-DoF pose configurations and unstable operating conditions such as illumination changes and shadows, where intrinsic semantic anomalies and physical disturbances coexist and interact. To overcome these limitations, a paradigm shift from passive feature learning to Active Canonicalization is proposed. PiCo (Pose-in-Condition Canonicalization) is introduced as a unified framework that actively projects observations onto a condition-invariant canonical manifold. PiCo operates through a cascaded mechanism. The first stage, Active Physical Canonicalization, enables a robotic agent to reorient objects in order to reduce geometric uncertainty at its source. The second stage, Neural Latent Canonicalization, adopts a three-stage denoising hierarchy consisting of photometric processing at the input level, latent refinement at the feature level, and contextual reasoning at the semantic level, progressively eliminating nuisance factors across representational scales. Extensive evaluations on the large-scale M2AD benchmark demonstrate the superiority of this paradigm. PiCo achieves a state-of-the-art 93.7% O-AUROC, representing a 3.7% improvement over prior methods in static settings, and attains 98.5% accuracy in active closed-loop scenarios. These results demonstrate that active manifold canonicalization is critical for robust embodied perception.
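The headline metric above, O-AUROC, can be illustrated with a minimal sketch. It assumes that O-AUROC denotes an object-level AUROC and that per-view anomaly scores are fused into one score per object by taking the maximum; the abstract does not specify the fusion rule, so that choice, along with the function names `auroc` and `object_level_auroc` and the toy records, is hypothetical.

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one anomalous sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def object_level_auroc(view_records):
    """view_records: iterable of (object_id, is_anomalous, view_score).

    Assumption (not from the paper): an object's score is the maximum
    score over its views, and an object is anomalous if any view is.
    """
    obj_score, obj_label = {}, {}
    for obj_id, label, score in view_records:
        obj_score[obj_id] = max(score, obj_score.get(obj_id, float("-inf")))
        obj_label[obj_id] = max(label, obj_label.get(obj_id, 0))
    ids = sorted(obj_score)
    return auroc([obj_label[i] for i in ids], [obj_score[i] for i in ids])


# Toy data: four objects, two views each (values are illustrative only).
records = [
    ("a", 0, 0.10), ("a", 0, 0.30),
    ("b", 1, 0.20), ("b", 1, 0.90),
    ("c", 0, 0.40), ("c", 0, 0.35),
    ("d", 1, 0.38), ("d", 1, 0.25),
]
print(object_level_auroc(records))  # 0.75
```

In the toy data, three of the four (anomalous, normal) object pairs are ranked correctly, which gives 0.75. A reported 93.7% would correspond to 0.937 on this scale.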
Problem

Research questions and friction points this paper is trying to address.

visual anomaly detection
6-DoF pose
illumination changes
physical disturbances
semantic anomalies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Active Canonicalization
Pose-in-Condition Canonicalization
Robotic Visual Anomaly Detection
Manifold Learning
Embodied Perception
Authors
Teng Yan
The Hong Kong University of Science and Technology (Guangzhou)
Binkai Liu
The Hong Kong University of Science and Technology (Guangzhou)
Shuai Liu
University of Southern California, Information Sciences Institute
Yue Yu
Hong Kong University of Science and Technology
Human-Centered AI, Visualization, Computational Social Science
Bingzhuo Zhong
The Hong Kong University of Science and Technology (Guangzhou)