🤖 AI Summary
To address scarce labeled samples and poor cross-device generalization in rotating machinery fault diagnosis, this paper proposes a multi-attention meta-learning Transformer framework. The method uses an unsupervised time-frequency domain encoder to extract discriminative fault features, and combines randomized data augmentation, contrastive learning, and meta-learning to transfer knowledge across devices at an extremely low labeling rate (only 1%). Its main novelty is embedding multi-head attention within a meta-learning-enabled Transformer architecture to build a highly transferable few-shot diagnostic model. Evaluated on bearing and rotor datasets, the framework achieves 99% classification accuracy, substantially outperforming state-of-the-art approaches, and demonstrates strong generalization and practical engineering applicability.
📝 Abstract
Intelligent fault diagnosis of rotating mechanical equipment usually requires a large amount of labeled sample data. However, in practical industrial applications, acquiring enough such data is both time-consuming and costly. Moreover, different types of rotating mechanical equipment have distinct mechanical properties, so a separate diagnostic model must be trained for each case. To address the challenges of limited fault samples and the lack of generalizability in prediction models for practical engineering applications, we propose a Multi-Attention Meta Transformer method for few-shot unsupervised rotating machinery fault diagnosis (MMT-FD). This framework extracts potential fault representations from unlabeled data and demonstrates strong generalization capabilities, making it suitable for diagnosing faults across various types of mechanical equipment. The MMT-FD framework integrates a time-frequency domain encoder and a meta-learning generalization model. The time-frequency domain encoder predicts status representations generated through random augmentations in the time-frequency domain. These augmented representations are then fed into a meta-learning network for classification and generalization training, followed by fine-tuning on a limited amount of labeled data. The model is optimized over a small number of contrastive learning iterations, which keeps training efficient. To validate the framework, we conducted experiments on a bearing fault dataset and rotor test bench data. The results demonstrate that the MMT-FD model achieves 99% fault diagnosis accuracy with only 1% of labeled sample data, exhibiting robust generalization capabilities.
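The abstract does not specify which augmentations or contrastive objective MMT-FD uses, so the following is only a minimal sketch of the general idea: two randomly augmented views of each unlabeled vibration signal (one perturbed in the time domain, one in the frequency domain) are pulled together by a contrastive loss. The function names, the particular augmentations (circular shift with jitter, random rFFT bin masking), and the choice of the NT-Xent loss are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def augment_time(x, rng, noise_std=0.05, max_shift=16):
    """Hypothetical time-domain view: random circular shift plus Gaussian jitter."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(x, shift, axis=-1) + rng.normal(0.0, noise_std, size=x.shape)


def augment_freq(x, rng, mask_ratio=0.1):
    """Hypothetical frequency-domain view: zero random rFFT bins, then invert."""
    spec = np.fft.rfft(x, axis=-1)
    keep = rng.random(spec.shape) >= mask_ratio  # True where a bin is kept
    return np.fft.irfft(spec * keep, n=x.shape[-1], axis=-1)


def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent contrastive loss; row i of z1 and row i of z2 form a positive pair."""
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)  # a sample is never its own negative
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable log-softmax over each row.
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - m - np.log(np.exp(sim - m).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * n), pos].mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    signals = rng.normal(size=(8, 256))  # stand-in for unlabeled vibration segments

    def embed(v):
        # Placeholder encoder (spectrum magnitude), NOT the paper's Transformer.
        return np.abs(np.fft.rfft(v, axis=-1))

    loss = nt_xent(embed(augment_time(signals, rng)), embed(augment_freq(signals, rng)))
    print("contrastive loss:", loss)
```

In a pipeline like the one described, the placeholder `embed` would be replaced by the trainable time-frequency encoder, and the pretrained representations would then be passed to the meta-learning stage and fine-tuned on the small labeled subset.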