Are Foundation Models Ready for Industrial Defect Recognition? A Reality Check on Real-World Data

📅 2025-09-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Foundation models (FMs) exhibit strong performance on public benchmarks but show questionable generalization to real-world industrial inspection images, raising concerns about their practical applicability in defect identification. Method: We systematically evaluate state-of-the-art vision FMs under zero-shot classification and text-prompted paradigms, using both public benchmarks and a newly constructed industrial defect dataset comprising authentic production-line imagery. Contribution/Results: Experiments reveal that while all models achieve high accuracy on public benchmarks, they consistently fail on real industrial images—exposing a critical cross-domain generalization bottleneck. This challenges the “plug-and-play” assumption for FM deployment in industrial vision and empirically uncovers the fundamental domain gap between academic benchmarks and factory-floor conditions. Our work provides the first large-scale, real-world industrial defect benchmark and delivers key empirical evidence to guide future research on domain-adaptive foundation model tuning for industrial visual inspection.

📝 Abstract
Foundation Models (FMs) have shown impressive performance on various text and image processing tasks. They can generalize across domains and datasets in a zero-shot setting. This could make them suitable for automated quality inspection during series manufacturing, where various types of images are evaluated for many different products. Replacing tedious labeling tasks with a simple text prompt that describes anomalies, and reusing the same models across many products, would save significant effort during model setup and implementation. This is a strong advantage over supervised Artificial Intelligence (AI) models, which are trained for individual applications and require labeled training data. We test multiple recent FMs on both custom real-world industrial image data and public image data. We show that all of these models fail on our real-world data, while the very same models perform well on public benchmark datasets.
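The zero-shot, text-prompted setup the abstract describes can be sketched as follows: a vision-language FM embeds the inspection image and each candidate text prompt into a shared space, and the prompt with the highest cosine similarity to the image embedding is taken as the prediction. The embeddings below are hypothetical toy vectors for illustration only; in a real pipeline they would come from a model such as CLIP.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def zero_shot_classify(image_emb, prompt_embs):
    """Return the prompt label whose embedding is most similar to the image."""
    return max(prompt_embs, key=lambda label: cosine(image_emb, prompt_embs[label]))

# Toy prompts describing a defect-free vs. a defective part
# (hypothetical 3-d embeddings, not from an actual model).
prompts = {
    "a photo of a defect-free part": [0.9, 0.1, 0.0],
    "a photo of a part with a scratch": [0.1, 0.8, 0.3],
}
image = [0.2, 0.7, 0.4]  # hypothetical embedding of an inspection image
print(zero_shot_classify(image, prompts))  # → "a photo of a part with a scratch"
```

No per-product labeled training data is needed in this paradigm, which is exactly the appeal the paper tests; its finding is that the similarity scores such models produce on real factory-floor images do not separate defective from defect-free parts the way they do on public benchmarks.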
Problem

Research questions and friction points this paper is trying to address.

Evaluating foundation models' readiness for industrial defect recognition
Testing FM generalization on real-world industrial versus benchmark data
Assessing zero-shot FM performance for automated quality inspection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Foundation Models tested on industrial defect data
Zero-shot generalization capability evaluated for manufacturing
Performance gap identified between benchmark and real-world data
Simon Baeuerle
PhD candidate, Karlsruhe Institute of Technology (KIT)
Artificial Intelligence
Pratik Khanna
Institute for Automation and Applied Informatics (IAI), Karlsruhe Institute of Technology
Nils Friederich
Doctoral Student, Karlsruhe Institute of Technology
Deep Learning, BioMedical Image Processing
Angelo Jovin Yamachui Sitcheu
Institute for Automation and Applied Informatics (IAI), Karlsruhe Institute of Technology
Damir Shakirov
Bosch Center for Artificial Intelligence, Robert Bosch GmbH
Andreas Steimer
Bosch Center for Artificial Intelligence, Robert Bosch GmbH
Ralf Mikut
Karlsruhe Institute of Technology (Germany)
data mining, image processing, computational intelligence, zebrafish, energy informatics