Rethinking the Mean Teacher Strategy from the Perspective of Self-paced Learning

📅 2025-05-16
🤖 AI Summary
To address the high annotation cost and the underutilization of unlabeled data in semi-supervised medical image segmentation, this paper reinterprets the Mean Teacher (MT) strategy from a self-paced learning perspective. It proposes Dual Teacher-Student Learning (DTSL), a framework with two groups of teacher-student models of different architectures, coupled with a Jensen–Shannon (JS) divergence-based Consensus Label Generator (CLG). The output agreement between cross-group teacher and student models regulates the learning pace and is used to generate pseudo-labels for unlabeled data. The method is reported to consistently outperform state-of-the-art approaches on popular medical imaging datasets, and ablation studies validate the contribution of the proposed modules.

📝 Abstract
Semi-supervised medical image segmentation has attracted significant attention due to its potential to reduce manual annotation costs. The mean teacher (MT) strategy, commonly understood as introducing smoothed, temporally lagged consistency regularization, has demonstrated strong performance across various tasks in this field. In this work, we reinterpret the MT strategy on supervised data as a form of self-paced learning, regulated by the output agreement between the temporally lagged teacher model and the ground truth labels. This idea is further extended to incorporate agreement between a temporally lagged model and a cross-architectural model, which offers greater flexibility in regulating the learning pace and enables application to unlabeled data. Specifically, we propose dual teacher-student learning (DTSL), a framework that introduces two groups of teacher-student models with different architectures. The output agreement between the cross-group teacher and student models is used as pseudo-labels, generated via a Jensen-Shannon divergence-based consensus label generator (CLG). Extensive experiments on popular datasets demonstrate that the proposed method consistently outperforms existing state-of-the-art approaches. Ablation studies further validate the effectiveness of the proposed modules.
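The "smoothed, temporally lagged" teacher mentioned above is usually obtained as an exponential moving average (EMA) of the student weights. The following is a minimal sketch of that update under stated assumptions: the dictionary-of-arrays parameter layout, the `ema_update` helper, and the `decay` value are hypothetical illustrations, not details taken from the paper.

```python
import numpy as np


def ema_update(teacher, student, decay=0.99):
    """Return teacher parameters moved toward the student by an EMA step.

    teacher, student: dicts mapping parameter names to NumPy arrays
    (an assumed layout, chosen only to keep the sketch framework-free).
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


# Toy run: the student is held fixed, so the teacher drifts toward it
# with a lag controlled by `decay`.
teacher = {"w": np.zeros(3)}
student = {"w": np.ones(3)}
for _ in range(10):
    teacher = ema_update(teacher, student, decay=0.9)

print(teacher["w"])
```

A larger `decay` makes the teacher lag further behind the student, which is the knob that the self-paced reading of MT ties to the learning pace.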
Problem

Research questions and friction points this paper is trying to address.

Reducing manual annotation costs in medical image segmentation
Reinterpreting mean teacher strategy as self-paced learning
Improving segmentation accuracy with dual teacher-student models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinterprets MT strategy as self-paced learning
Introduces dual teacher-student models with different architectures
Uses Jensen-Shannon divergence for consensus pseudo-labels
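To make the consensus pseudo-label idea concrete, here is a rough sketch of how a Jensen-Shannon divergence between two models' per-pixel class probabilities could gate pseudo-labels. It is an illustration under assumptions, not the paper's CLG: the threshold `tau`, the `ignore_index` value, and the rule of taking the argmax of the averaged distribution are all hypothetical choices.

```python
import numpy as np


def js_divergence(p, q, eps=1e-12):
    """Per-pixel JS divergence; class probabilities lie on the last axis."""
    m = 0.5 * (p + q)
    kl_pm = np.sum(p * (np.log(p + eps) - np.log(m + eps)), axis=-1)
    kl_qm = np.sum(q * (np.log(q + eps) - np.log(m + eps)), axis=-1)
    return 0.5 * (kl_pm + kl_qm)


def consensus_labels(p, q, tau=0.05, ignore_index=-1):
    """Keep a pseudo-label only where the two predictions agree closely.

    p, q: arrays of shape (..., num_classes), e.g. a teacher's and a
    cross-architecture student's softmax outputs (assumed inputs).
    Pixels whose JS divergence is at least `tau` get `ignore_index`.
    """
    jsd = js_divergence(p, q)
    labels = np.argmax(0.5 * (p + q), axis=-1)
    return np.where(jsd < tau, labels, ignore_index), jsd


# Two "pixels": the models agree on the first and disagree on the second.
p = np.array([[0.9, 0.1], [0.2, 0.8]])
q = np.array([[0.8, 0.2], [0.9, 0.1]])
labels, jsd = consensus_labels(p, q)
print(labels)
```

In this toy input the first pixel keeps its label and the second is masked out, so a loss computed on `labels` would skip pixels where the two models disagree.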
Pengchen Zhang
Center for Applied Mathematics, Tianjin University
Alan J.X. Guo
Center for Applied Mathematics, Tianjin University
Combinatorics, Deep Learning
Sipin Luo
Department of Radiology, Tianjin Hospital of Tianjin University
Zhe Han
King's College London
Medical Imaging
Lin Guo
Department of Radiology, Tianjin Hospital of Tianjin University