SemiETPicker: Fast and Label-Efficient Particle Picking for CryoET Tomography Using Semi-Supervised Learning

📅 2025-10-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Particle picking in cryo-electron tomography (CryoET) relies heavily on labor-intensive manual annotation, which leaves large amounts of unlabeled data underutilized and limits scalability. To address this label scarcity, we propose a label-efficient semi-supervised learning framework. First, we design an end-to-end detection model, supervised with heatmaps in the style of keypoint detection, to localize particles precisely. Second, we introduce a teacher-student co-training scheme, augmented with multi-view pseudo-label generation, that enforces consistent predictions across views. Third, we develop a CryoET-specific DropBlock augmentation tailored to the anisotropic noise and missing-wedge artifacts inherent in tomographic data. Evaluated under extreme label sparsity (e.g., only 1% annotated samples), our method improves F1 by 10% over supervised baselines on the large-scale CZII dataset, making better use of unlabeled data and reducing the manual effort of particle picking.
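The heatmap-supervised detection idea can be illustrated with a small sketch. It is not the authors' implementation: it only shows how annotated particle centers could be turned into a 3D Gaussian target volume, and how centers could be read back from a predicted heatmap as local maxima. The function names `gaussian_heatmap` and `pick_peaks`, the `sigma` and `threshold` values, and the 3x3x3 neighborhood are all assumptions made for illustration, and plain Python lists stand in for the array library a real pipeline would use.

```python
import math

def gaussian_heatmap(shape, centers, sigma=1.0):
    """Build a 3D target volume with a Gaussian blob at each (z, y, x) center."""
    depth, height, width = shape
    heat = [[[0.0] * width for _ in range(height)] for _ in range(depth)]
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                for cz, cy, cx in centers:
                    d2 = (z - cz) ** 2 + (y - cy) ** 2 + (x - cx) ** 2
                    value = math.exp(-d2 / (2.0 * sigma ** 2))
                    # Where blobs overlap, keep the larger value (assumed convention).
                    if value > heat[z][y][x]:
                        heat[z][y][x] = value
    return heat

def pick_peaks(heat, threshold=0.5):
    """Return voxels at or above threshold that no 3x3x3 neighbor exceeds."""
    depth, height, width = len(heat), len(heat[0]), len(heat[0][0])
    peaks = []
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                value = heat[z][y][x]
                if value < threshold:
                    continue
                is_max = True
                for dz in (-1, 0, 1):
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            nz, ny, nx = z + dz, y + dy, x + dx
                            inside = (0 <= nz < depth and 0 <= ny < height
                                      and 0 <= nx < width)
                            if inside and heat[nz][ny][nx] > value:
                                is_max = False
                if is_max:
                    peaks.append((z, y, x))
    return peaks

# Two hypothetical particle centers in a tiny 8x8x8 volume.
target = gaussian_heatmap((8, 8, 8), [(2, 2, 2), (5, 6, 4)])
print(pick_peaks(target))
```

In this toy setup the target itself is decoded, so the recovered peaks are exactly the annotated centers; with a trained model, `pick_peaks` would instead be applied to the predicted heatmap.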

📝 Abstract

Cryogenic Electron Tomography (CryoET) combined with sub-volume averaging (SVA) is the only imaging modality capable of resolving protein structures inside cells at molecular resolution. Particle picking, the task of localizing and classifying target proteins in 3D CryoET volumes, remains the main bottleneck. Due to the reliance on time-consuming manual labels, the vast reserve of unlabeled tomograms remains underutilized. In this work, we present a fast, label-efficient semi-supervised framework that exploits this untapped data. Our framework consists of two components: (i) an end-to-end heatmap-supervised detection model inspired by keypoint detection, and (ii) a teacher-student co-training mechanism that enhances performance under sparse labeling conditions. Furthermore, we introduce multi-view pseudo-labeling and a CryoET-specific DropBlock augmentation strategy to further boost performance. Extensive evaluations on the large-scale CZII dataset show that our approach improves F1 by 10% over supervised baselines, underscoring the promise of semi-supervised learning for leveraging unlabeled CryoET data.
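The abstract does not spell out how its CryoET-specific DropBlock differs from the standard technique, so the following is only a sketch of generic DropBlock extended to 3D volumes. The anisotropic block shape (thinner along z than along y and x) is an assumption chosen to hint at how such an augmentation might respect the different behavior of the tomogram axes; the function name `dropblock_3d` and the parameters `block`, `anchor_prob`, and `rng_seed` are hypothetical.

```python
import random

def dropblock_3d(volume, block=(1, 2, 2), anchor_prob=0.05, rng_seed=None):
    """Zero out contiguous 3D blocks and rescale the surviving voxels.

    Each valid anchor voxel starts a dropped block with probability
    anchor_prob (a simplification of the usual drop-rate calculation).
    """
    rng = random.Random(rng_seed)
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    bz, by, bx = block
    keep = [[[1] * width for _ in range(height)] for _ in range(depth)]
    for z in range(depth - bz + 1):
        for y in range(height - by + 1):
            for x in range(width - bx + 1):
                if rng.random() < anchor_prob:
                    for dz in range(bz):
                        for dy in range(by):
                            for dx in range(bx):
                                keep[z + dz][y + dy][x + dx] = 0
    kept = sum(v for plane in keep for row in plane for v in row)
    # Rescale so the expected activation magnitude stays roughly unchanged.
    scale = (depth * height * width) / kept if kept else 0.0
    return [[[volume[z][y][x] * keep[z][y][x] * scale for x in range(width)]
             for y in range(height)] for z in range(depth)]

# A constant 4x6x6 volume: dropped voxels become 0, the rest are scaled up.
ones = [[[1.0] * 6 for _ in range(6)] for _ in range(4)]
augmented = dropblock_3d(ones, rng_seed=0)
```

Dropping whole blocks rather than isolated voxels removes spatially correlated information, which is the usual motivation for DropBlock over plain dropout on convolutional features.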

Problem

Research questions and friction points this paper is trying to address.

Automates particle localization in CryoET volumes using limited labels

Enhances detection accuracy through semi-supervised teacher-student training

Exploits the large reserve of unlabeled tomograms that manual labeling leaves underutilized

Innovation

Methods, ideas, or system contributions that make the work stand out.

Semi-supervised learning framework for particle picking

Teacher-student co-training with sparse labeling

Multi-view pseudo-labeling and CryoET-specific augmentation
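The teacher-student and multi-view ideas listed above can be sketched as follows. The page does not describe the exact mechanism, so this makes two explicit assumptions: the "views" are axis flips of the input volume, and the teacher is updated as an exponential moving average (EMA) of the student, as in mean-teacher style training. The names `flip`, `multi_view_pseudo_labels`, and `ema_update`, along with the `threshold` and `decay` values, are hypothetical.

```python
def flip(volume, axis):
    """Reverse a nested-list volume along axis 0 (z), 1 (y), or 2 (x)."""
    if axis == 0:
        return list(reversed(volume))
    if axis == 1:
        return [list(reversed(plane)) for plane in volume]
    return [[list(reversed(row)) for row in plane] for plane in volume]

def multi_view_pseudo_labels(teacher, volume, axes=(0, 1, 2), threshold=0.5):
    """Average teacher heatmaps over flipped views, then threshold the mean."""
    views = [teacher(volume)]
    for axis in axes:
        # Predict on the flipped input, then flip the output back to align it.
        views.append(flip(teacher(flip(volume, axis)), axis))
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    mean = [[[sum(v[z][y][x] for v in views) / len(views)
              for x in range(width)] for y in range(height)]
            for z in range(depth)]
    labels = [(z, y, x) for z in range(depth) for y in range(height)
              for x in range(width) if mean[z][y][x] >= threshold]
    return mean, labels

def ema_update(teacher_weights, student_weights, decay=0.99):
    """Move each teacher weight a small step toward the student (assumed EMA)."""
    return [decay * t + (1.0 - decay) * s
            for t, s in zip(teacher_weights, student_weights)]
```

Averaging over views is one plausible reason multi-view pseudo-labels help: a response that appears in only one view is diluted by the mean, while a response that is consistent across views survives the threshold and is passed to the student as a pseudo-label.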

🔎 Similar Papers

FakET: Simulating Cryo-Electron Tomograms with Neural Style Transfer