🤖 AI Summary
This work addresses the challenge of learning general-purpose feature representations from unlabeled CAD boundary representation (BRep) models to support downstream tasks such as part classification, modeling segmentation, and machining feature recognition. To this end, the authors propose the first self-supervised learning framework that integrates a masked graph autoencoder with a hierarchical graph Transformer. The approach randomly masks geometric and attribute information in BRep graphs and reconstructs them, while introducing a cross-scale mutual attention mechanism to capture long-range geometric dependencies and effectively fuse local details with global structural context. Extensive experiments demonstrate that the method significantly outperforms existing models across multiple downstream tasks, exhibiting exceptional generalization capability and practical utility, particularly in few-shot scenarios.
📝 Abstract
We introduce a novel self-supervised learning framework that automatically learns representations from input computer-aided design (CAD) models for downstream tasks, including part classification, modeling segmentation, and machining feature recognition. To train our network, we construct a large-scale, unlabeled dataset of boundary representation (BRep) models. The success of our algorithm relies on two keycomponents. The first is a masked graph autoencoder that reconstructs randomly masked geometries and attributes of BReps for representation learning to enhance the generalization. The second is a hierarchical graph Transformer architecture that elegantly fuses global and local learning by a cross-scale mutual attention block to model long-range geometric dependencies and a graph neural network block to aggregate local topological information. After training the autoencoder, we replace its decoder with a task-specific network trained on a small amount of labeled data for downstream tasks. We conduct experiments on various tasks and achieve high performance, even with a small amount of labeled data, demonstrating the practicality and generalizability of our model. Compared to other methods, our model performs significantly better on downstream tasks with the same amount of training data, particularly when the training data is very limited.