Unsupervised Part Discovery via Descriptor-Based Masked Image Restoration with Optimized Constraints

📅 2025-07-16
🤖 AI Summary
Because fine-grained annotations are scarce, unsupervised part discovery tends to lose robustness across object categories and in complex scenes. This paper proposes a descriptor-guided masked image restoration framework that restores masked patches using learned part descriptors, guided by local appearance features from unmasked regions, so that the discovered parts follow the actual part shapes. The key contribution is a set of looser yet more effective optimization constraints, combined with masked modeling, matching between local features and part descriptors, an autoencoder architecture, and contrastive learning. Together these allow label-free identification of which parts are present across diverse object categories, which supports occlusion handling and cross-category part similarity analysis. Experiments show that the method discovers structurally coherent and semantically meaningful parts across multiple categories and complex scenes.

📝 Abstract
Part-level features are crucial for image understanding, but few studies focus on them because of the lack of fine-grained labels. Although unsupervised part discovery can eliminate the reliance on labels, most existing methods cannot maintain robustness across various categories and scenarios, which restricts their range of application. To overcome this limitation, we present a more effective paradigm for unsupervised part discovery, named Masked Part Autoencoder (MPAE). It first learns part descriptors and a feature map from the inputs, and produces patch features from a masked version of the original images. The masked regions are then filled with the learned part descriptors according to the similarity between the local features and the descriptors. Restoring the masked patches from the part descriptors, guided by appearance features from unmasked patches, aligns the descriptors more closely with the shapes of their parts. As a result, MPAE robustly discovers meaningful parts that closely match the actual object shapes, even in complex scenarios. Moreover, several looser yet more effective constraints are proposed to enable MPAE to identify the presence of parts across various scenarios and categories in an unsupervised manner. This provides the foundation for addressing challenges posed by occlusion and for exploring part similarity across multiple categories. Extensive experiments demonstrate that our method robustly discovers meaningful parts across various categories and scenarios. The code is available at https://github.com/Jiahao-UTS/MPAE.
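
The filling step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that); the function name `fill_masked_patches`, the softmax weighting, the cosine similarity, and the `temperature` parameter are all assumptions made for illustration. It only shows the general idea of replacing masked patch features with a similarity-weighted mix of part descriptors.

```python
import numpy as np

def fill_masked_patches(patch_feats, local_feats, descriptors, mask, temperature=0.1):
    """Hypothetical sketch: fill masked patches with part descriptors.

    patch_feats: (N, D) patch features from the masked image
    local_feats: (N, D) local features from the feature map
    descriptors: (K, D) part descriptors
    mask:        (N,) boolean array, True where a patch is masked
    """
    eps = 1e-8
    # Cosine similarity between every local feature and every descriptor.
    lf = local_feats / (np.linalg.norm(local_feats, axis=1, keepdims=True) + eps)
    ds = descriptors / (np.linalg.norm(descriptors, axis=1, keepdims=True) + eps)
    sim = lf @ ds.T  # shape (N, K)

    # Softmax over parts (assumed weighting; numerically stabilised).
    logits = sim / temperature
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Replace only the masked patches; unmasked patches keep their features.
    filled = patch_feats.copy()
    filled[mask] = weights[mask] @ descriptors

    # The most similar descriptor per patch gives a coarse part assignment.
    part_ids = weights.argmax(axis=1)
    return filled, part_ids
```

In this sketch the per-patch `argmax` doubles as a rough part segmentation, which mirrors the abstract's point that similarity to the descriptors determines which part occupies each region.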

Problem

Research questions and friction points this paper is trying to address.

Unsupervised discovery of part-level features without fine-grained labels
Robust part discovery across diverse categories and scenarios
Addressing occlusion and part similarity challenges via optimized constraints

Innovation

Methods, ideas, or system contributions that make the work stand out.

Descriptor-based masked image restoration
Optimized constraints for robustness
Unsupervised part discovery across categories

Authors

Jiahao Xia, Research Fellow, University of Technology Sydney. Research interests: Deep Learning
Yike Wu, Faculty of Engineering and IT, University of Technology Sydney
Wenjian Huang, Peking University. Research interests: Biomedical Image & Signal Processing, Machine Learning, Artificial Intelligence, Statistical Learning, Computer Vision
Jianguo Zhang, Dept. of Comp. Sci. and Eng., Southern University of Science and Technology
Jian Zhang, Faculty of Engineering and IT, University of Technology Sydney