🤖 AI Summary
Conventional atomic-scale characterization of 2D materials relies heavily on extensively trained human experts and suffers from low accuracy and poor robustness when identifying novel structures. To address this, we propose the first zero-shot, fully automated optical microscopy characterization system—requiring no manual annotations or domain-specific pretraining—that enables cross-material, interference-resilient, closed-loop intelligent analysis. Methodologically, we introduce the first integration of a vision foundation model (Segment Anything Model, SAM) with a large language model (ChatGPT), augmented by unsupervised clustering and topological analysis, all orchestrated via prompt engineering to drive microscope control, image segmentation, and physical interpretation. Our system achieves 99.7% segmentation accuracy on monolayer MoS₂—matching expert-level performance—and detects grain-boundary slits imperceptible to the human eye. It demonstrates strong robustness against defocus, color-temperature shifts, and exposure variations, and generalizes across diverse 2D materials including graphene, WSe₂, and SnSe.
📝 Abstract
Characterization of atomic-scale materials traditionally requires human experts with months to years of specialized training. Even for trained operators, accurate and reliable characterization remains challenging when examining newly discovered materials such as two-dimensional (2D) structures. This bottleneck drives demand for fully autonomous experimentation systems capable of comprehending research objectives without requiring large training datasets. In this work, we present ATOMIC (Autonomous Technology for Optical Microscopy & Intelligent Characterization), an end-to-end framework that integrates foundation models to enable fully autonomous, zero-shot characterization of 2D materials. Our system combines a vision foundation model (Segment Anything Model), large language models (ChatGPT), unsupervised clustering, and topological analysis to automate microscope control, sample scanning, image segmentation, and intelligent analysis through prompt engineering, eliminating the need for additional training. When analyzing typical MoS₂ samples, our approach achieves 99.7% segmentation accuracy for single-layer identification, equivalent to that of human experts. In addition, the integrated model is able to detect grain-boundary slits that are challenging to identify with the human eye. Furthermore, the system retains robust accuracy despite variable conditions including defocus, color-temperature fluctuations, and exposure variations. It is applicable to a broad spectrum of common 2D materials, including graphene, MoS₂, WSe₂, and SnSe, regardless of whether they were fabricated via chemical vapor deposition or mechanical exfoliation. This work demonstrates the use of foundation models to achieve autonomous analysis, establishing a scalable and data-efficient characterization paradigm that fundamentally transforms the approach to nanoscale materials research.
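To make the training-free pipeline concrete, the sketch below illustrates one step the abstract names: unsupervised clustering of optical-contrast values to separate substrate, monolayer, and few-layer regions without labeled data. This is an illustrative assumption, not the authors' implementation: the function name, the deterministic min/mean/max initialization, and the toy intensity values are all hypothetical, and a real pipeline would cluster statistics of SAM-produced masks rather than raw pixels.

```python
# Minimal sketch (stdlib only) of the unsupervised-clustering step:
# a plain 1-D k-means over scalar optical-contrast values. All names
# and the toy intensities are illustrative, not from the ATOMIC paper.

def kmeans_1d(values, k=3, iters=50):
    """Cluster scalar intensities into k groups; returns sorted centers.

    Uses a deterministic min/mean/max-style initialization (an
    assumption made here for reproducibility of the sketch).
    """
    lo, hi = min(values), max(values)
    mid = sum(values) / len(values)
    centers = sorted({lo, mid, hi})[:k]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for v in values:
            # Assign each value to its nearest current center.
            i = min(range(len(centers)), key=lambda j: abs(v - centers[j]))
            clusters[i].append(v)
        # Recompute each center as its cluster mean (keep empty clusters).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

# Toy contrast values: bright substrate (~200), monolayer (~150),
# bilayer (~110). Well-separated groups converge in a few iterations.
pixels = [198, 201, 205, 149, 152, 148, 108, 112, 151, 199]
centers = kmeans_1d(pixels, k=3)
print(centers)  # three centers near 110, 150, and 200
```

Because the cluster count and contrast ordering follow directly from thin-film interference physics, no per-material training is needed, which is what makes the approach zero-shot.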