AI Summary
Semi-supervised regression (SSR) suffers from two key challenges: high sensitivity to pseudo-label quality and a propensity for overfitting in direct regression. To address these, we propose DRILL, a novel framework that reformulates continuous regression as discrete distribution estimation and introduces a decoupled representation distillation architecture. DRILL employs a teacher-student joint learning paradigm with a decoupled distribution alignment mechanism: it separately aligns target and non-target bin distributions, thereby mitigating pseudo-label bias while preserving consistency regularization. This design enhances the stability and robustness of knowledge transfer. Crucially, DRILL performs end-to-end optimization of discrete distribution predictions without requiring post-hoc calibration. Extensive experiments across diverse benchmark datasets demonstrate that DRILL consistently outperforms state-of-the-art SSR methods, validating its strong generalization capability and superior performance.
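The decoupled alignment idea above can be sketched as follows. Note this is a hypothetical illustration: the summary does not specify DRILL's exact loss, so the split into a binary target/non-target term plus a KL term over the remaining bins (modeled on decoupled knowledge distillation) is an assumption, as are the function names.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def decoupled_alignment_loss(teacher_logits, student_logits, target_bin):
    """Align target and non-target bin distributions separately.

    Hypothetical sketch, not the paper's published loss.
    """
    p_t = softmax(teacher_logits)
    p_s = softmax(student_logits)
    eps = 1e-12

    # Target-bin term: binary cross-entropy between the probability mass
    # each model assigns to the target bin vs. everything else.
    t_t, t_s = p_t[target_bin], p_s[target_bin]
    target_term = -(t_t * np.log(t_s + eps)
                    + (1 - t_t) * np.log(1 - t_s + eps))

    # Non-target term: KL divergence between teacher and student
    # distributions renormalized over the non-target bins only, so the
    # teacher's relative ranking of wrong bins is transferred on its own.
    mask = np.ones_like(p_t, dtype=bool)
    mask[target_bin] = False
    q_t = p_t[mask] / (p_t[mask].sum() + eps)
    q_s = p_s[mask] / (p_s[mask].sum() + eps)
    nontarget_term = np.sum(q_t * np.log((q_t + eps) / (q_s + eps)))

    return target_term + nontarget_term
```

Because the two terms are computed on renormalized distributions, the target-bin agreement and the shape of the non-target distribution can be weighted independently, which is what lets this style of alignment tolerate a biased pseudo-label for the target bin.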
Abstract
Semi-supervised regression (SSR), which aims to predict continuous scores for samples while reducing reliance on large amounts of labeled data, has recently received considerable attention across various applications, including computer vision, natural language processing, and audio and medical analysis. Existing semi-supervised methods typically apply consistency regularization to the general regression task by generating pseudo-labels. However, these methods rely heavily on the quality of the pseudo-labels, and direct regression fails to learn the label distribution and can easily lead to overfitting. To address these challenges, we introduce DRILL, an end-to-end Decoupled Representation distillation framework designed specifically for semi-supervised regression. DRILL transforms the general regression task into a Discrete Distribution Estimation (DDE) task over multiple buckets, which better captures the underlying label distribution and mitigates the overfitting risk associated with direct regression. We then employ Decoupled Distribution Alignment (DDA) to align the target bucket and the non-target buckets between teacher and student over the bucket distribution, encouraging the student to learn more robust and generalized knowledge from the teacher. Extensive experiments on datasets from diverse domains demonstrate that DRILL generalizes well and outperforms competing methods.
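The DDE step, turning a continuous label into a distribution over buckets and decoding a score back out, can be sketched as below. This is a hedged illustration: the abstract does not specify DRILL's discretization, so the Gaussian soft-label scheme, the `sigma` parameter, and the expectation-based decoding are assumptions, though the decoding matches the stated property of needing no post-hoc calibration.

```python
import numpy as np

def label_to_distribution(y, bin_centers, sigma=0.5):
    """Convert a continuous label y into a soft distribution over discrete
    buckets using a Gaussian kernel centered at the true value.

    Hypothetical sketch; sigma controls how much probability mass
    spreads to neighboring buckets.
    """
    logits = -((bin_centers - y) ** 2) / (2 * sigma ** 2)
    logits -= logits.max()          # numerical stability before exp
    dist = np.exp(logits)
    return dist / dist.sum()        # normalize to a valid distribution

def distribution_to_score(dist, bin_centers):
    """Decode a predicted bucket distribution back to a continuous score
    as the expectation over bin centers."""
    return float(np.dot(dist, bin_centers))
```

For example, with `bin_centers = np.linspace(0, 10, 101)` (step 0.1), a label of 3.2 becomes a small Gaussian bump over the buckets near 3.2, and taking the expectation recovers a score very close to 3.2, so the model can be trained and evaluated entirely in distribution space.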