Learning to Score

📅 2025-04-19

📈 Citations: 1

✨ Influential: 0

career value

177K/year

🤖 AI Summary

This work addresses the problem of unsupervised continuous severity scoring under label scarcity or ambiguous label definitions—e.g., ill-defined clinical disease progression criteria. We propose a novel unsupervised scoring framework integrating representation learning, side-information modeling, and metric learning. For the first time, we formalize clinical symptoms and domain-specific constraints as semantic constraints or auxiliary signals, and design an end-to-end trainable semantic triplet architecture that eliminates reliance on explicit labels. Our method introduces a constraint-aware loss function that jointly optimizes structured side-information encoding and pairwise/triplet metric learning. Evaluated on standard benchmarks and real-world biomedical electronic health records, the approach significantly outperforms baselines: the learned severity scores achieve high concordance with clinical assessments (Pearson *r* > 0.82), while demonstrating strong interpretability and cross-institutional generalizability.

Technology Category

Application Category

📝 Abstract

Common machine learning settings range from supervised tasks, where accurately labeled data is accessible, through semi-supervised and weakly-supervised tasks, where target labels are scant or noisy, to unsupervised tasks where labels are unobtainable. In this paper we study a scenario where the target labels are not available but additional related information is at hand. This information, referred to as Side Information, is either correlated with the unknown labels or imposes constraints on the feature space. We formulate the problem as an ensemble of three semantic components: representation learning, side information and metric learning. The proposed scoring model is advantageous for multiple use-cases. For example, in the healthcare domain it can be used to create a severity score for diseases where the symptoms are known but the criteria for the disease progression are not well defined. We demonstrate the utility of the suggested scoring system on well-known benchmark data-sets and bio-medical patient records.

Problem

Research questions and friction points this paper is trying to address.

Develops scoring model without target labels using side information

Integrates representation learning, side information, and metric learning

Applies scoring system to healthcare and benchmark datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Utilizes Side Information for label correlation

Combines representation and metric learning

Applies ensemble of three semantic components

🔎 Similar Papers

Revealing the learning process in reinforcement learning agents through attention-oriented metrics