🤖 AI Summary
To address the challenges of nanoscale wafer defect detection in integrated circuit manufacturing (complex backgrounds, diverse defect textures, severe label scarcity, and poor model transferability), this paper proposes a lightweight few-shot learning framework for high-accuracy defect classification and pixel-level segmentation. The core contribution is the first semiconductor-domain-adaptive CLIP architecture, which incorporates domain-knowledge-informed text prompts, background suppression mechanisms, and vision-language alignment strategies. The paper also designs a text-guided feature engineering module and a few-shot segmentation fine-tuning framework. Evaluated under extreme label scarcity (1–5 annotated samples per class), the method achieves a 23.6% improvement in classification accuracy and a segmentation mIoU of 78.4%, substantially outperforming existing few-shot approaches. It significantly reduces annotation overhead while improving cross-scene generalization.
📝 Abstract
In integrated circuit manufacturing, the detection and classification of nanoscale wafer defects are critical for subsequent root cause analysis and yield enhancement. The complex background patterns in scanning electron microscope (SEM) images and the diverse textures of the defects pose significant challenges. Traditional methods usually suffer from insufficient data and labels, as well as poor transferability. In this paper, we propose a novel few-shot learning approach, SEM-CLIP, for accurate defect classification and segmentation. SEM-CLIP customizes the Contrastive Language-Image Pretraining (CLIP) model to focus on defect areas and minimize background distractions, thereby enhancing segmentation accuracy. We employ text prompts enriched with domain knowledge as prior information to assist in precise analysis. Additionally, our approach incorporates feature engineering with textual guidance to categorize defects more effectively. SEM-CLIP requires little annotated data, substantially reducing labor demands in the semiconductor industry. Extensive experiments demonstrate that our model achieves strong classification and segmentation results in few-shot learning scenarios.
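As a rough illustration of how text prompts can act as class priors in a CLIP-style model, the sketch below scores one image embedding against one text embedding per class using cosine similarity followed by a softmax. It shows only the generic CLIP scoring step, not SEM-CLIP's customized architecture. The class names, prompt wording, temperature value, and toy three-dimensional embeddings are all assumptions made for illustration; in practice the embeddings would come from the image and text encoders.

```python
import math

# Hypothetical domain-knowledge prompts; the class names and wording are
# illustrative assumptions, not taken from the paper.
CLASS_PROMPTS = {
    "bridge": "an SEM image of a wafer with a bridge defect joining adjacent lines",
    "particle": "an SEM image of a wafer with a particle defect on the pattern",
    "normal": "an SEM image of a defect-free wafer pattern",
}


def l2_normalize(vec):
    """Scale a vector to unit length so that a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def clip_style_probs(image_emb, text_embs, temperature=0.01):
    """Return softmax probabilities over classes from image-text cosine similarities.

    image_emb: embedding of one image (list of floats).
    text_embs: one embedding per class prompt (list of lists of floats).
    temperature: assumed scaling factor; smaller values sharpen the distribution.
    """
    img = l2_normalize(image_emb)
    logits = []
    for text_emb in text_embs:
        txt = l2_normalize(text_emb)
        cosine = sum(a * b for a, b in zip(img, txt))
        logits.append(cosine / temperature)
    # Subtract the maximum logit before exponentiating for numerical stability.
    max_logit = max(logits)
    exps = [math.exp(z - max_logit) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Toy embeddings standing in for encoder outputs (assumption).
    toy_text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    toy_image_emb = [1.0, 0.1, 0.0]
    probs = clip_style_probs(toy_image_emb, toy_text_embs)
    for name, p in zip(CLASS_PROMPTS, probs):
        print(f"{name}: {p:.4f}")
```

In a few-shot setting, the same similarity scores could be combined with features from the handful of labeled examples, but how SEM-CLIP does this is not specified in the abstract.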