Perturb-and-Restore: Simulation-driven Structural Augmentation Framework for Imbalance Chromosomal Anomaly Detection

📅 2026-04-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the detection of structural chromosomal abnormalities, where real abnormal samples are scarce and the classes are extremely imbalanced, both of which severely limit deep learning performance. The authors propose a simulation-driven structural augmentation framework that generates synthetic abnormalities by perturbing the banding patterns of normal chromosomes and then uses a diffusion network to restore structural continuity, which removes the reliance on rare abnormal samples for augmentation. An energy score-guided adaptive sampling mechanism then selects high-quality synthetic samples online during training by referencing the energy distribution of real samples. On a dataset of over 260,000 chromosome images, including 4,242 abnormal samples across 24 categories, the method achieves state-of-the-art performance, improving average sensitivity, precision, and F1-score by 8.92%, 8.89%, and 13.79%, respectively. The work is presented as the first application of energy distribution-guided dynamic sampling in chromosomal abnormality detection.
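The reported gains are averages of sensitivity, precision, and F1-score across categories. As a rough illustration of how such per-category averages are commonly computed, here is a minimal sketch; the function name `macro_metrics` and the toy labels are hypothetical, and the paper's exact evaluation protocol is not described on this page.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return (sensitivity, precision, F1), each averaged with equal weight per category."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted category gains a false positive
            fn[truth] += 1  # true category gains a false negative

    sens, prec, f1 = [], [], []
    for c in labels:
        s = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        f = 2 * s * p / (s + p) if s + p else 0.0
        sens.append(s)
        prec.append(p)
        f1.append(f)

    n = len(labels)
    return sum(sens) / n, sum(prec) / n, sum(f1) / n


# Toy, made-up labels: 8 normal and 2 abnormal chromosomes, one abnormal missed.
y_true = ["normal"] * 8 + ["abnormal"] * 2
y_pred = ["normal"] * 8 + ["abnormal", "normal"]
sensitivity, precision, f1_score = macro_metrics(y_true, y_pred)
```

Equal weighting per category matters under imbalance: a rare abnormal category counts as much as the dominant normal one, so a missed abnormal sample moves the average far more than overall accuracy would suggest.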
📝 Abstract
Detecting structural chromosomal abnormalities is crucial for the accurate diagnosis and management of genetic disorders. However, collecting sufficient structural abnormality data is extremely challenging and costly in clinical practice, and not all abnormal types can be readily collected. As a result, deep learning approaches suffer significant performance degradation due to the severe imbalance and scarcity of abnormal chromosome data. To address this challenge, we propose Perturb-and-Restore (P&R), a simulation-driven structural augmentation framework that effectively alleviates data imbalance in chromosome anomaly detection. The P&R framework comprises two key components: (1) Structure Perturbation and Restoration Simulation, which generates synthetic abnormal chromosomes by perturbing the banding patterns of normal chromosomes and then applying a restoration diffusion network that reconstructs continuous chromosome content and edges, thus eliminating reliance on rare abnormal samples; and (2) Energy-guided Adaptive Sampling, an energy score-based online selection strategy that dynamically prioritizes high-quality synthetic samples by referencing the energy distribution of real samples. To evaluate our method, we construct a comprehensive structural anomaly dataset of over 260,000 chromosome images, including 4,242 abnormal samples spanning 24 categories. Experimental results demonstrate that the P&R framework achieves state-of-the-art (SOTA) performance, surpassing existing methods with an average improvement of 8.92% in sensitivity, 8.89% in precision, and 13.79% in F1-score across all categories.
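The abstract does not give the formula behind Energy-guided Adaptive Sampling. The sketch below is one plausible reading, not the authors' implementation: it uses the common energy score over classifier logits, E(x) = -T · logsumexp(logits / T), and weights each synthetic sample by how close its energy lies to the mean energy of real samples. The Gaussian weighting, the temperature default, and all function names are assumptions made for illustration.

```python
import math
import random


def energy_score(logits, temperature=1.0):
    """Energy E(x) = -T * logsumexp(logits / T), computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    return -temperature * (m + math.log(sum(math.exp(s - m) for s in scaled)))


def sampling_weights(synthetic_energies, real_energies):
    """Normalized weights favoring synthetic samples whose energy is near the real-sample mean.

    The Gaussian kernel is an assumption of this sketch, not a detail from the paper.
    """
    mu = sum(real_energies) / len(real_energies)
    var = sum((e - mu) ** 2 for e in real_energies) / len(real_energies)
    std = math.sqrt(var) or 1.0  # fall back to 1.0 if all real energies are identical
    raw = [math.exp(-0.5 * ((e - mu) / std) ** 2) for e in synthetic_energies]
    total = sum(raw)
    if total == 0.0:  # every synthetic sample is far from the real distribution
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


def select_batch(synthetic_logits, real_logits, k, rng=random):
    """Draw k synthetic-sample indices (with replacement) according to the weights."""
    synthetic_energies = [energy_score(z) for z in synthetic_logits]
    real_energies = [energy_score(z) for z in real_logits]
    weights = sampling_weights(synthetic_energies, real_energies)
    return rng.choices(range(len(synthetic_logits)), weights=weights, k=k)
```

Because the logits change as the classifier trains, recomputing the weights at each selection step is what would make such a scheme online: a synthetic sample that looks unrealistic early in training can regain weight later, and vice versa.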
Problem

Research questions and friction points this paper is trying to address.

chromosomal anomaly detection
data imbalance
structural abnormalities
deep learning
data scarcity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Perturb-and-Restore
structural chromosomal anomaly detection
simulation-driven augmentation
diffusion restoration network
energy-guided sampling
Authors
Yilan Zhang
King Abdullah University of Science and Technology
Computer Vision, Medical Image Analysis
Hanbiao Chen
Guangdong Provincial Maternal and Child Health Hospital, Guangzhou 511442, China
Changchun Yang
KAUST
medical image analysis, imaging science
Yuetan Chu
Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia; Center of Excellence on Smart Health (KCSH) and Center of Excellence for Generative AI, KAUST, Thuwal 23955-6900, Saudi Arabia
Siyuan Chen
KING ABDULLAH UNIVERSITY/SCIENCE/TECHNOL
Deep Learning Bioinformatics
Jing Wu
Guangdong Provincial Maternal and Child Health Hospital, Guangzhou 511442, China
Jingdong Hu
Smiltec, Suzhou 215125, China
Na Li
Smiltec, Suzhou 215125, China
Junkai Su
Smiltec, Suzhou 215125, China
Yuxuan Chen
Smiltec, Suzhou 215125, China
Ao Xu
Smiltec, Suzhou 215125, China
Xin Gao
Chair and Professor, Computer Sciences; Co-Chair, Center of Excellence for Smart Health
Bioinformatics, computational biology, machine learning
Aihua Yin
Guangdong Provincial Maternal and Child Health Hospital, Guangzhou 511442, China