Needle in a Haystack -- One-Class Representation Learning for Detecting Rare Malignant Cells in Computational Cytology

📅 2026-04-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the detection of malignant cells in computational cytology, where positives are extremely rare (≤1%) and morphologically heterogeneous, so conventional methods struggle with severe class imbalance and scarce annotations. The authors study one-class representation learning trained exclusively on patches from negative slides, evaluating two deep one-class classification methods: Deep Support Vector Data Description (DSVDD) and Deep Representation One-Class Classification (DROC), the latter relying on distribution-augmented contrastive learning. Both learn compact representations of normal cells without instance-level labels and flag deviations from normality at test time. On the public TCIA bone marrow cytomorphology dataset and an in-house oral cancer cytology dataset, DSVDD achieves state-of-the-art instance-level abnormality ranking at ultra-low witness rates and, in some cases, outperforms fully supervised baselines.
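For readers unfamiliar with DSVDD, the "compact representation of normality" can be written down using the standard one-class Deep SVDD objective from the general literature. The summary does not state which variant or hyperparameters the authors use, so the symbols below ($\phi$, $\mathcal{W}$, $c$, $\lambda$) are illustrative notation rather than the paper's own.

```latex
% One-class Deep SVDD: pull embeddings of normal patches x_1..x_n toward a fixed center c
\min_{\mathcal{W}} \; \frac{1}{n} \sum_{i=1}^{n} \left\lVert \phi(x_i; \mathcal{W}) - c \right\rVert^{2}
  \;+\; \frac{\lambda}{2} \sum_{\ell=1}^{L} \left\lVert W^{\ell} \right\rVert_{F}^{2}

% Anomaly score at test time: squared distance of a patch embedding from the center
s(x) = \left\lVert \phi(x; \mathcal{W}^{*}) - c \right\rVert^{2}
```

Here $\phi(\cdot; \mathcal{W})$ is the encoder with layer weights $W^{1}, \dots, W^{L}$, $c$ is a fixed center in embedding space, and $\lambda$ weights the weight-decay term. Patches with large $s(x)$ are ranked as more likely abnormal.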

📝 Abstract

In computational cytology, detecting malignancy on whole-slide images is difficult because malignant cells are morphologically diverse yet vanishingly rare amid a vast background of normal cells. Accurate detection of these cells remains challenging due to severe class imbalance and limited annotations. Conventional weakly supervised approaches, such as multiple instance learning (MIL), often fail to generalize at the instance level, especially when the fraction of malignant cells (the witness rate) is exceedingly low. In this study, we explore one-class representation learning for detecting malignant cells in low-witness-rate scenarios. These methods are trained exclusively on slide-negative patches, without requiring any instance-level supervision. Specifically, we evaluate two one-class classification (OCC) approaches, DSVDD and DROC, and compare them with FS-SIL, WS-SIL, and the recent ItS2CLR method. The one-class methods learn compact representations of normality and detect deviations at test time. Experiments on a publicly available bone marrow cytomorphology dataset (TCIA) and an in-house oral cancer cytology dataset show that DSVDD achieves state-of-the-art performance in instance-level abnormality ranking, particularly in ultra-low witness-rate regimes ($\leq 1\%$). In some cases it even outperforms fully supervised learning, which is typically not a practical option in whole-slide cytology because exhaustive instance-level annotation is infeasible. DROC is also competitive under extreme rarity, benefiting from distribution-augmented contrastive learning. These findings highlight one-class representation learning as a robust and interpretable alternative that can outperform MIL for malignant cell detection under extreme rarity.
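The train-on-negatives, score-by-deviation workflow described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the embeddings are synthetic Gaussian vectors standing in for encoder outputs, the center is simply the mean of the normal embeddings (no deep encoder is trained), and AUROC is used as the ranking metric, which the abstract does not specify. The function names are hypothetical.

```python
import numpy as np


def fit_center(normal_embeddings):
    """Center of normality: mean embedding of patches from negative slides."""
    return normal_embeddings.mean(axis=0)


def anomaly_scores(embeddings, center):
    """SVDD-style score: squared Euclidean distance from the center."""
    diff = embeddings - center
    return np.einsum("ij,ij->i", diff, diff)


def auroc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count half)."""
    labels = np.asarray(labels).astype(bool)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return float(greater + 0.5 * ties)


def run_demo(seed=0, dim=16, n_test=1000, n_malignant=10):
    """Synthetic example with a 1% witness rate (assumed data, not the paper's)."""
    rng = np.random.default_rng(seed)
    # "Training": only normal embeddings are seen, mirroring slide-negative patches.
    train_normal = rng.normal(0.0, 1.0, size=(2000, dim))
    center = fit_center(train_normal)

    # Test set: mostly normal cells plus a few shifted "malignant" embeddings.
    test_normal = rng.normal(0.0, 1.0, size=(n_test - n_malignant, dim))
    test_malignant = rng.normal(3.0, 1.0, size=(n_malignant, dim))
    test = np.vstack([test_normal, test_malignant])
    labels = np.concatenate([np.zeros(n_test - n_malignant), np.ones(n_malignant)])

    witness_rate = labels.mean()
    scores = anomaly_scores(test, center)
    return witness_rate, auroc(scores, labels)


if __name__ == "__main__":
    witness_rate, auc = run_demo()
    print(f"witness rate = {witness_rate:.2%}, AUROC = {auc:.3f}")
```

Because no positive label is used when fitting the center, the procedure is unaffected by how few malignant cells exist, which is the property the abstract emphasizes for low-witness-rate regimes. In a real pipeline the embeddings would come from a trained encoder rather than a random generator.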

Problem

Research questions and friction points this paper is trying to address.

rare malignant cell detection

computational cytology

class imbalance

low witness rate

whole-slide image analysis

Innovation

Methods, ideas, or system contributions that make the work stand out.

one-class representation learning

rare malignant cell detection

computational cytology