Toward Generalist Semi-supervised Regression via Decoupled Representation Distillation

πŸ“… 2025-08-12
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
Semi-supervised regression (SSR) suffers from two key challenges: high sensitivity to pseudo-label quality and a propensity for overfitting in direct regression. To address these, we propose DRILLβ€”a novel framework that reformulates continuous regression as discrete distribution estimation and introduces a decoupled representation distillation architecture. DRILL employs a teacher-student joint learning paradigm with a decoupled distribution alignment mechanism: it separately aligns target and non-target bin distributions, thereby mitigating pseudo-label bias while preserving consistency regularization. This design enhances the stability and robustness of knowledge transfer. Crucially, DRILL performs end-to-end optimization of discrete distribution predictions without requiring post-hoc calibration. Extensive experiments across diverse benchmark datasets demonstrate that DRILL consistently outperforms state-of-the-art SSR methods, validating its strong generalization capability and superior performance.

πŸ“ Abstract
Semi-supervised regression (SSR), which aims to predict continuous scores for samples while reducing reliance on large amounts of labeled data, has recently received considerable attention across applications including computer vision, natural language processing, and audio and medical analysis. Existing semi-supervised methods typically apply consistency regularization to the general regression task by generating pseudo-labels. However, these methods rely heavily on pseudo-label quality, and direct regression fails to learn the label distribution and can easily overfit. To address these challenges, we introduce DRILL, an end-to-end Decoupled Representation distillation framework designed specifically for semi-supervised regression. DRILL transforms the general regression task into a Discrete Distribution Estimation (DDE) task over multiple buckets, better capturing the underlying label distribution and mitigating the overfitting risk associated with direct regression. We then employ Decoupled Distribution Alignment (DDA) to align the target-bucket and non-target-bucket distributions between teacher and student, encouraging the student to learn more robust and generalized knowledge from the teacher. Extensive experiments on datasets from diverse domains demonstrate that DRILL generalizes strongly and outperforms competing methods.
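The core DDE idea of the abstract, turning a continuous target into a distribution over discrete buckets and decoding a prediction back to a scalar, can be sketched as follows. This is a generic illustration, not the paper's implementation: the bucket range, count, and the linear two-bucket soft-labeling scheme are assumptions.

```python
import numpy as np

def soft_bucket_label(y, lo=0.0, hi=1.0, k=10):
    """Convert a continuous target y in [lo, hi] into a soft distribution
    over k equal-width buckets by splitting mass between the two nearest
    bucket centers (a hypothetical soft-labeling scheme)."""
    centers = lo + (np.arange(k) + 0.5) * (hi - lo) / k
    # Fractional bucket position of y, clipped to the valid center range.
    pos = np.clip((y - lo) / (hi - lo) * k - 0.5, 0, k - 1)
    left = int(np.floor(pos))
    right = min(left + 1, k - 1)
    w = pos - left
    dist = np.zeros(k)
    dist[left] += 1 - w
    dist[right] += w
    return centers, dist

def expected_value(centers, dist):
    # Decode a predicted bucket distribution back to a scalar score.
    return float(np.dot(centers, dist))
```

A model trained this way predicts a softmax over the `k` buckets instead of a single scalar, and the final score is recovered as the expectation over bucket centers.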
Problem

Research questions and friction points this paper is trying to address.

Addresses overfitting in semi-supervised regression tasks
Transforms regression into discrete distribution estimation problem
Aligns teacher-student bucket distributions for robust learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Discrete Distribution Estimation over buckets
Decoupled Distribution Alignment method
End-to-end representation distillation framework
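The decoupled alignment contribution above can be illustrated with a minimal sketch: split each bucket distribution into the target bucket versus the rest, then align the binary (target / non-target) split and the renormalized non-target distribution separately between teacher and student. The KL-based losses, the `alpha`/`beta` weights, and the `eps` smoothing are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def decoupled_alignment_loss(teacher, student, target, alpha=1.0, beta=1.0, eps=1e-8):
    """Hypothetical decoupled alignment between a teacher and a student
    bucket distribution: one KL term for the target-vs-rest binary split,
    one for the renormalized non-target distribution."""
    t, s = np.asarray(teacher, float), np.asarray(student, float)
    # Binary split: mass on the target bucket vs. everything else.
    tb = np.array([t[target], 1 - t[target]]) + eps
    sb = np.array([s[target], 1 - s[target]]) + eps
    kl_target = float(np.sum(tb * np.log(tb / sb)))
    # Non-target buckets, renormalized to sum to one.
    mask = np.arange(len(t)) != target
    tn = t[mask] / (t[mask].sum() + eps) + eps
    sn = s[mask] / (s[mask].sum() + eps) + eps
    kl_nontarget = float(np.sum(tn * np.log(tn / sn)))
    return alpha * kl_target + beta * kl_nontarget
```

Decoupling the two terms lets the pseudo-label's (possibly noisy) target bucket be weighted independently of the relational knowledge carried by the non-target buckets.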
Ye Su
Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences
Hezhe Qiao
Singapore Management University (SMU)
LLM Hallucination Detection/Mitigation Β· Graph Anomaly Detection Β· Foundation Model
Wei Huang
Beijing University of Posts and Telecommunications
Lin Chen
Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences