🤖 AI Summary
To address the high annotation cost and underutilization of unlabeled data in semi-supervised medical image segmentation, this paper reformulates the Mean Teacher (MT) framework from a self-paced learning perspective — the first such adaptation. We propose a Dual Teacher-Student Learning (DTSL) framework coupled with a Jensen–Shannon (JS) divergence-driven Consensus Label Generator (CLG). By jointly modeling agreement between a temporally lagged teacher and heterogeneous student outputs, DTSL dynamically modulates the learning pace and produces robust cross-architecture pseudo-labels. Integrating self-paced learning principles with consistency regularization, the method consistently outperforms state-of-the-art approaches on multiple benchmark medical imaging datasets. Ablation studies confirm that both DTSL and CLG contribute substantially, with the largest gains in segmentation accuracy and generalization appearing under low annotation budgets.
📝 Abstract
Semi-supervised medical image segmentation has attracted significant attention due to its potential to reduce manual annotation costs. The Mean Teacher (MT) strategy, commonly understood as introducing smoothed, temporally lagged consistency regularization, has demonstrated strong performance across various tasks in this field. In this work, we reinterpret the MT strategy on supervised data as a form of self-paced learning, regulated by the output agreement between the temporally lagged teacher model and the ground-truth labels. We then extend this idea to the agreement between a temporally lagged model and a cross-architectural model, which offers greater flexibility in regulating the learning pace and makes the strategy applicable to unlabeled data. Specifically, we propose dual teacher-student learning (DTSL), a framework that introduces two groups of teacher-student models with different architectures. The agreement between cross-group teacher and student outputs is converted into pseudo-labels by a Jensen–Shannon (JS) divergence-based consensus label generator (CLG). Extensive experiments on popular benchmark datasets demonstrate that the proposed method consistently outperforms existing state-of-the-art approaches, and ablation studies further validate the effectiveness of the proposed modules.
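To make the CLG idea concrete, here is a minimal sketch of how a JS-divergence-based consensus label generator could work. This is an illustrative assumption, not the paper's implementation: the function names (`js_divergence`, `consensus_labels`) and the fixed agreement `threshold` are hypothetical, and the paper's actual CLG may weight or schedule agreement differently.

```python
import numpy as np

def js_divergence(p, q, eps=1e-8):
    """Pointwise Jensen-Shannon divergence between two categorical
    distributions along the last axis (class dimension)."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b), axis=-1)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def consensus_labels(probs_a, probs_b, threshold=0.1):
    """Hypothetical consensus label generator: average the two models'
    softmax outputs and take the argmax as the pseudo-label, but keep
    only positions where the JS divergence is below `threshold`
    (mask == True means the two models agree enough to trust)."""
    js = js_divergence(probs_a, probs_b)
    labels = np.argmax(0.5 * (probs_a + probs_b), axis=-1)
    mask = js < threshold
    return labels, mask

# Two pixels, two classes: the models agree on the first pixel
# and disagree on the second, so only the first is kept.
probs_a = np.array([[0.90, 0.10], [0.50, 0.50]])
probs_b = np.array([[0.85, 0.15], [0.05, 0.95]])
labels, mask = consensus_labels(probs_a, probs_b)
```

Masking out high-divergence positions is one simple way to let cross-architecture agreement regulate which unlabeled pixels enter training, which mirrors the self-paced-learning reading of MT described above.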