Kaputt: A Large-Scale Dataset for Visual Defect Detection

📅 2025-10-07
🤖 AI Summary
Problem: Existing industrial anomaly detection methods generalize poorly under the high variability of object pose and appearance found in retail logistics, and the field has lacked a domain-specific benchmark. Method: The authors introduce Kaputt, the first large-scale defect detection benchmark tailored to retail logistics, comprising over 230,000 real-world images, more than 48,000 distinct objects, and more than 29,000 defective instances, which makes it 40× larger than MVTec-AD. The dataset features multi-view, multi-illumination acquisition and fine-grained annotations compatible with unsupervised and few-shot settings. Contribution/Results: Extensive evaluation shows that state-of-the-art methods reach a maximum AUROC of only 56.96% on Kaputt, far below the up to 99.9% they achieve on conventional benchmarks such as MVTec-AD and VisA, which underlines the dataset's difficulty and realism. Kaputt fills a gap in anomaly detection evaluation for complex, uncontrolled industrial environments and gives researchers a testbed for methods that must stay robust under high variability and limited control over imaging conditions.

📝 Abstract
We present a novel large-scale dataset for defect detection in a logistics setting. Recent work on industrial anomaly detection has primarily focused on manufacturing scenarios with highly controlled poses and a limited number of object categories. Existing benchmarks like MVTec-AD [6] and VisA [33] have reached saturation, with state-of-the-art methods achieving up to 99.9% AUROC scores. In contrast to manufacturing, anomaly detection in retail logistics faces new challenges, particularly in the diversity and variability of object pose and appearance. Leading anomaly detection methods fall short when applied to this new setting. To bridge this gap, we introduce a new benchmark that overcomes the current limitations of existing datasets. With over 230,000 images (and more than 29,000 defective instances), it is 40 times larger than MVTec-AD and contains more than 48,000 distinct objects. To validate the difficulty of the problem, we conduct an extensive evaluation of multiple state-of-the-art anomaly detection methods, demonstrating that they do not surpass 56.96% AUROC on our dataset. Further qualitative analysis confirms that existing methods struggle to leverage normal samples under heavy pose and appearance variation. With our large-scale dataset, we set a new benchmark and encourage future research towards solving this challenging problem in retail logistics anomaly detection. The dataset is available for download at https://www.kaputt-dataset.com.
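The abstract reports results as AUROC, so a small illustration of that metric may help when reading the 56.96% figure. The following is a minimal sketch, not the authors' evaluation code: the function name `auroc` and the example scores and labels are hypothetical. It computes AUROC as the fraction of (defective, normal) pairs in which the defective sample receives the higher anomaly score, counting ties as half. Under this definition a detector that ranks samples at random scores about 50%, so 56.96% is only modestly above chance.

```python
def auroc(scores, labels):
    """AUROC as a pairwise ranking statistic.

    scores: anomaly score per image (higher = more likely defective).
    labels: 1 for defective, 0 for normal.
    Returns the fraction of (defective, normal) pairs ranked correctly,
    with ties counted as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one defective and one normal sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six images: three defective, three normal.
scores = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(f"AUROC = {auroc(scores, labels):.4f}")  # 8 of 9 pairs ranked correctly
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; for a dataset of this size a rank-based or library implementation would be the practical choice.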
Problem

Research questions and friction points this paper is trying to address.

Addresses visual defect detection challenges in retail logistics environments
Addresses the scale limits of existing datasets with over 230,000 images, 40x more than MVTec-AD
Targets anomaly detection under heavy pose and appearance variation, where current methods do not surpass 56.96% AUROC
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces a large-scale defect detection dataset for retail logistics
Moves beyond the controlled poses and limited object categories of manufacturing benchmarks
Enables benchmarking under heavy pose variation
Authors

Sebastian Höfer
Manager Applied Science, Amazon Fulfillment Technologies & Robotics
Computer Vision, Machine Learning, Robotics

Dorian Henning
Amazon, Fulfillment Technologies & Robotics

Artemij Amiranashvili
Amazon, Fulfillment Technologies & Robotics

Douglas Morrison
Amazon, Fulfillment Technologies & Robotics

Mariliza Tzes
Applied Scientist II, Amazon
Robotics, Autonomous Navigation, Computer Vision, Artificial Intelligence, Control

Ingmar Posner
Oxford University
Robotics, perception, machine learning, classification, vision

Marc Matvienko
Amazon, Fulfillment Technologies & Robotics

Alessandro Rennola
Amazon, Fulfillment Technologies & Robotics

Anton Milan
Amazon
Computer Vision, Robotics