Learning to Score

📅 2025-04-19
📈 Citations: 1
✨ Influential: 0
🤖 AI Summary
This work addresses the problem of unsupervised continuous severity scoring under label scarcity or ambiguous label definitions, e.g., ill-defined clinical disease progression criteria. We propose a novel unsupervised scoring framework integrating representation learning, side-information modeling, and metric learning. For the first time, we formalize clinical symptoms and domain-specific constraints as semantic constraints or auxiliary signals, and design an end-to-end trainable semantic triplet architecture that eliminates reliance on explicit labels. Our method introduces a constraint-aware loss function that jointly optimizes structured side-information encoding and pairwise/triplet metric learning. Evaluated on standard benchmarks and real-world biomedical electronic health records, the approach significantly outperforms baselines: the learned severity scores achieve high concordance with clinical assessments (Pearson *r* > 0.82), while demonstrating strong interpretability and cross-institutional generalizability.

📝 Abstract
Common machine learning settings range from supervised tasks, where accurately labeled data is accessible, through semi-supervised and weakly-supervised tasks, where target labels are scant or noisy, to unsupervised tasks where labels are unobtainable. In this paper we study a scenario where the target labels are not available but additional related information is at hand. This information, referred to as Side Information, is either correlated with the unknown labels or imposes constraints on the feature space. We formulate the problem as an ensemble of three semantic components: representation learning, side information and metric learning. The proposed scoring model is advantageous for multiple use-cases. For example, in the healthcare domain it can be used to create a severity score for diseases where the symptoms are known but the criteria for the disease progression are not well defined. We demonstrate the utility of the suggested scoring system on well-known benchmark data-sets and bio-medical patient records.
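
The interplay of side information and metric learning described above can be illustrated with a small sketch. This is not the authors' architecture or loss, which this summary does not specify. It assumes a single scalar side-information signal per sample and uses a standard triplet margin loss; the helper names (`mine_triplets`, `triplet_margin_loss`, `mean_triplet_loss`), the `tolerance` threshold, and the toy values are all hypothetical and chosen only for illustration.

```python
import math

def mine_triplets(side_info, tolerance=0.5):
    """Build (anchor, positive, negative) index triplets from a scalar side signal.

    Assumption: samples whose side-information values differ by at most
    `tolerance` should be close in the learned space (positives), and all
    other samples should be farther away (negatives).
    """
    n = len(side_info)
    triplets = []
    for a in range(n):
        for p in range(n):
            if p == a or abs(side_info[a] - side_info[p]) > tolerance:
                continue
            for q in range(n):
                if abs(side_info[a] - side_info[q]) > tolerance:
                    triplets.append((a, p, q))
    return triplets

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss: the anchor should be closer to the positive than to the negative by `margin`."""
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

def mean_triplet_loss(embeddings, triplets, margin=1.0):
    """Average the triplet loss over all mined triplets."""
    losses = [
        triplet_margin_loss(embeddings[a], embeddings[p], embeddings[q], margin)
        for a, p, q in triplets
    ]
    return sum(losses) / len(losses)

# Toy side information: two mild cases and two severe cases (made-up values).
side_info = [0.0, 0.2, 2.0, 2.1]
triplets = mine_triplets(side_info)

# One-dimensional "embeddings": the first respects the side information,
# the second swaps two samples and therefore violates it.
consistent = [[0.0], [0.1], [3.0], [3.2]]
inconsistent = [[0.0], [3.0], [0.1], [3.2]]

loss_consistent = mean_triplet_loss(consistent, triplets)
loss_inconsistent = mean_triplet_loss(inconsistent, triplets)
```

In this toy setting the embedding that agrees with the side information incurs no loss, while the one that contradicts it is penalized. In a trainable model, a loss of this kind would be minimized with respect to the parameters of the representation network rather than evaluated on fixed vectors.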

Problem

Research questions and friction points this paper is trying to address.

Develops a scoring model without target labels by using side information
Integrates representation learning, side information, and metric learning
Applies scoring system to healthcare and benchmark datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

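Uses side information that is correlated with the unknown labels or that constrains the feature space
Combines representation learning and metric learning
Frames the problem as an ensemble of three semantic components: representation learning, side information, and metric learning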
Yogev Kriger
Efi Arazi School of Computer Science, Reichman University
Shai Fine
Head of the Data Science Institute, Reichman University (IDC)