Subtyping Breast Lesions via Generative Augmentation based Long-tailed Recognition in Ultrasound

📅 2025-07-30

🤖 AI Summary

Lesion subtype recognition in breast ultrasound suffers from severe data imbalance because subtype incidence follows a long-tailed class distribution. To address this, we propose a two-stage adaptive framework: (1) a sketch-guided controllable generative network that incorporates anatomical priors to preserve the category fidelity of synthesized images; and (2) a reinforcement learning-driven multi-agent sampler that dynamically optimizes the ratio of real to synthetic samples during training. Additionally, we introduce an annotation-free inference mechanism to enhance generalization. Evaluated on a private long-tailed and a public imbalanced breast ultrasound dataset, our method outperforms existing state-of-the-art approaches, achieving an average 4.2% improvement in F1-score. This demonstrates the effectiveness of combining generative data augmentation with adaptive sampling to mitigate long-tail bias.
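As a rough illustration of what a real-to-synthetic ratio means at the batch level, the sketch below draws samples for a single class from a real pool and a synthetic pool. It is not the paper's sampler: the function name `mix_class_samples`, the `synth_ratio` parameter, and the string placeholders standing in for images are all assumptions made for this example.

```python
import random


def mix_class_samples(real, synthetic, synth_ratio, n_samples, rng):
    """Draw n_samples items for one class from real and synthetic pools.

    Hypothetical helper for illustration only. synth_ratio is the fraction
    of the draw taken from the synthetic pool (0.0 = real only).
    """
    if not 0.0 <= synth_ratio <= 1.0:
        raise ValueError("synth_ratio must lie in [0, 1]")
    if not real:
        raise ValueError("the real pool must not be empty")

    # Fall back to real-only sampling when no synthetic images exist.
    n_synth = round(synth_ratio * n_samples) if synthetic else 0
    n_real = n_samples - n_synth

    # Sample with replacement so small tail classes can still fill a batch.
    batch = [rng.choice(synthetic) for _ in range(n_synth)]
    batch += [rng.choice(real) for _ in range(n_real)]
    rng.shuffle(batch)
    return batch


# Toy pools: two real images and three synthetic ones for a rare subtype.
rng = random.Random(0)
real_pool = ["real_0", "real_1"]
synth_pool = ["synth_0", "synth_1", "synth_2"]
batch = mix_class_samples(real_pool, synth_pool, synth_ratio=0.75, n_samples=8, rng=rng)
```

With a ratio of 0.75 and eight samples, six items come from the synthetic pool and two from the real pool. In the framework described above, this ratio would be adjusted during training rather than fixed by hand.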

📝 Abstract

Accurate identification of breast lesion subtypes can facilitate personalized treatment and interventions. Ultrasound (US), as a safe and accessible imaging modality, is extensively employed in breast abnormality screening and diagnosis. However, the incidence of different subtypes follows a skewed long-tailed distribution, posing significant challenges for automated recognition. Generative augmentation provides a promising way to rebalance the data distribution. Inspired by this, we propose a dual-phase framework for long-tailed classification that mitigates distributional bias through high-fidelity data synthesis while avoiding overuse of synthetic data that would degrade overall performance. The framework incorporates a reinforcement learning-driven adaptive sampler that dynamically calibrates synthetic-real data ratios by training strategic multi-agents, compensating for the scarcity of real data while ensuring stable discriminative capability. Furthermore, our class-controllable synthetic network integrates a sketch-grounded perception branch that harnesses anatomical priors to maintain distinctive class features while enabling annotation-free inference. Extensive experiments on an in-house long-tailed breast US dataset and a public imbalanced breast US dataset demonstrate that our method achieves promising performance compared to state-of-the-art approaches. More synthetic images can be found at https://github.com/Stinalalala/Breast-LT-GenAug.
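The idea of learning a synthetic-real ratio from training feedback can be sketched with a much simpler stand-in than the paper's multi-agent reinforcement learning sampler. The example below uses a single epsilon-greedy bandit over a fixed set of candidate ratios; the class name `RatioBandit`, the candidate ratios, and the reward signal are assumptions for illustration, not details taken from the paper.

```python
import random


class RatioBandit:
    """Epsilon-greedy bandit over candidate synthetic-data ratios.

    A simplified stand-in (assumption), not the paper's multi-agent sampler.
    Each arm is one candidate ratio; the reward could be, for example,
    a validation score measured after training with that ratio.
    """

    def __init__(self, ratios, epsilon=0.1, seed=0):
        self.ratios = list(ratios)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = [0] * len(self.ratios)    # pulls per arm
        self.values = [0.0] * len(self.ratios)  # running mean reward per arm

    def select(self):
        """Return the index of the ratio to try next."""
        if self.rng.random() < self.epsilon:
            # Explore: pick a random candidate ratio.
            return self.rng.randrange(len(self.ratios))
        # Exploit: pick the ratio with the best mean reward so far.
        return max(range(len(self.ratios)), key=lambda i: self.values[i])

    def update(self, arm, reward):
        """Fold a new reward into the running mean of the chosen arm."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


# Greedy bandit (epsilon=0) over three hypothetical ratios.
bandit = RatioBandit(ratios=[0.0, 0.25, 0.5], epsilon=0.0)
bandit.update(0, 0.2)
bandit.update(0, 0.4)
bandit.update(2, 1.0)
best_ratio = bandit.ratios[bandit.select()]
```

In a training loop, `select` would be called before each round, the classifier trained with the chosen ratio, and `update` called with the resulting validation score. A per-class version would keep one such bandit for each lesion subtype, which is closer in spirit to, but still far simpler than, a multi-agent setup.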

Problem

Research questions and friction points this paper is trying to address.

Addresses long-tailed distribution in breast lesion subtype classification

Uses generative augmentation to mitigate data imbalance

Ensures stable discriminative capability with adaptive synthetic data

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative augmentation for long-tailed data

Reinforcement learning-driven adaptive sampler

Class-controllable synthetic network with anatomical priors
