Knowledge-based anomaly detection for identifying network-induced shape artifacts

📅 2025-11-06
🤖 AI Summary
Problem: Synthetic medical images often exhibit anatomical shape artifacts introduced by generative models, which compromise model generalizability and clinical trustworthiness. Method: We propose a two-stage anomaly detection framework grounded in anatomical priors. First, a feature extractor builds a feature space from the per-image distribution of angle gradients along anatomical boundaries; then, an Isolation Forest operates on those features to enable unsupervised, single-image-level detection of subtle morphological anomalies. Contribution/Results: Our core innovation lies in formalizing anatomical plausibility as a quantifiable geometric distribution constraint that is explicitly embedded into the anomaly detection pipeline. Evaluated on two synthetic mammography datasets, the method achieves AUCs of 0.97 and 0.91. In a reader study, images in the most anomalous partition were flagged by human readers approximately 1.5-2 times as often as images in the least anomalous partition. This framework helps developers check synthetic medical data against known anatomical constraints and improve its anatomical fidelity.

📝 Abstract
Synthetic data provides a promising approach to address data scarcity for training machine learning models; however, adoption without proper quality assessments may introduce artifacts, distortions, and unrealistic features that compromise model performance and clinical utility. This work introduces a novel knowledge-based anomaly detection method for detecting network-induced shape artifacts in synthetic images. The introduced method utilizes a two-stage framework comprising (i) a novel feature extractor that constructs a specialized feature space by analyzing the per-image distribution of angle gradients along anatomical boundaries, and (ii) an isolation forest-based anomaly detector. We demonstrate the effectiveness of the method for identifying network-induced shape artifacts in two synthetic mammography datasets from models trained on CSAW-M and VinDr-Mammo patient datasets respectively. Quantitative evaluation shows that the method successfully concentrates artifacts in the most anomalous partition (1st percentile), with AUC values of 0.97 (CSAW-syn) and 0.91 (VMLO-syn). In addition, a reader study involving three imaging scientists confirmed that images identified by the method as containing network-induced shape artifacts were also flagged by human readers with mean agreement rates of 66% (CSAW-syn) and 68% (VMLO-syn) for the most anomalous partition, approximately 1.5-2 times higher than the least anomalous partition. Kendall-Tau correlations between algorithmic and human rankings were 0.45 and 0.43 for the two datasets, indicating reasonable agreement despite the challenging nature of subtle artifact detection. This method is a step forward in the responsible use of synthetic data, as it allows developers to evaluate synthetic images for known anatomic constraints and pinpoint and address specific issues to improve the overall quality of a synthetic dataset.
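The abstract reports Kendall-Tau correlations between algorithmic and human rankings. As a rough illustration of what such a rank correlation measures, the sketch below computes the tie-free variant (tau-a) in plain Python. The function name `kendall_tau` and the two example rankings are hypothetical and are not taken from the paper; the authors' exact variant and tie handling are not stated in the abstract.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Kendall tau-a: (concordant - discordant) / number of pairs.

    Assumes both sequences rank the same items and contain no ties.
    """
    if len(rank_a) != len(rank_b) or len(rank_a) < 2:
        raise ValueError("need two rankings of equal length >= 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(rank_a)), 2):
        # A pair is concordant when both rankings order items i and j the same way.
        sign = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(rank_a) * (len(rank_a) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical rankings of six images (1 = most anomalous).
algorithm_rank = [1, 2, 3, 4, 5, 6]
human_rank = [2, 1, 3, 5, 4, 6]

tau = kendall_tau(algorithm_rank, human_rank)
print(round(tau, 3))  # 13 concordant and 2 discordant pairs out of 15 -> 0.733
```

A value of 1 means the two rankings agree on every pair and -1 means they disagree on every pair, so values such as 0.45 and 0.43 indicate that noticeably more pairs are ordered consistently than inconsistently. For real data with ties, a library routine such as `scipy.stats.kendalltau` would be the more usual choice.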
Problem

Research questions and friction points this paper is trying to address.

Detecting network-induced shape artifacts in synthetic medical images
Assessing the quality of synthetic data so that it can be used responsibly to address data scarcity
Identifying unrealistic features compromising model performance and clinical utility
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage framework with specialized feature extractor
Analyzes angle gradients along anatomical boundaries
Uses isolation forest for anomaly detection
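The two stages listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the histogram form of the feature, the bin count, the helper names `angle_gradient_features` and `toy_boundary`, and the half-ellipse stand-in for a segmented breast boundary are all hypothetical. Only `IsolationForest` from scikit-learn and standard NumPy calls are used.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def angle_gradient_features(boundary, bins=15):
    """Stage 1 (assumed form): histogram of angle changes along one boundary.

    boundary: (N, 2) array of ordered (x, y) points on an anatomical contour.
    """
    steps = np.diff(boundary, axis=0)                          # segment vectors
    angles = np.unwrap(np.arctan2(steps[:, 1], steps[:, 0]))   # tangent angles
    gradients = np.diff(angles)                                # angle change per step
    hist, _ = np.histogram(gradients, bins=bins, range=(-np.pi, np.pi))
    return hist / hist.sum()                                   # per-image distribution


def toy_boundary(rng, notch=False, n_points=120):
    """Hypothetical stand-in for a segmented boundary: a smooth half ellipse."""
    t = np.linspace(0.0, np.pi, n_points)
    a, b = rng.uniform(0.8, 1.2), rng.uniform(1.5, 2.5)
    pts = np.column_stack([a * np.sin(t), b * np.cos(t)])
    if notch:
        k = rng.integers(20, n_points - 20)
        pts[k] *= 0.7        # pull one point inward to mimic a sharp shape artifact
    return pts


rng = np.random.default_rng(0)
boundaries = [toy_boundary(rng) for _ in range(200)]              # smooth shapes
boundaries += [toy_boundary(rng, notch=True) for _ in range(3)]   # indices 200-202
X = np.array([angle_gradient_features(b) for b in boundaries])

# Stage 2: unsupervised anomaly scoring; lower score_samples = more anomalous.
forest = IsolationForest(n_estimators=100, random_state=0).fit(X)
scores = forest.score_samples(X)
most_anomalous = np.argsort(scores)[:3]
print(sorted(most_anomalous.tolist()))
```

In this deliberately clean toy set the smooth boundaries produce nearly identical histograms, so the three notched shapes are isolated after very few splits. Real boundaries extracted from synthetic mammograms would be noisier, and the choice of binning and boundary extraction would need to follow the paper.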
Rucha Deshpande
Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, U.S. Food and Drug Administration, USA
Tahsin Rahman
Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, U.S. Food and Drug Administration, USA
Miguel Lago
Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, U.S. Food and Drug Administration, USA
Adarsh Subbaswamy
Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, U.S. Food and Drug Administration, USA
Jana G. Delfino
Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, U.S. Food and Drug Administration, USA
Ghada Zamzmi
FDA/CDRH/OSEL/DIDSR
Artificial Intelligence, Machine Learning, Computer Vision, Affective Computing, Medical Imaging
Elim Thompson
Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, U.S. Food and Drug Administration, USA
Aldo Badano
FDA
Medical imaging, in silico imaging trials
Seyed Kahaki
Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, U.S. Food and Drug Administration, USA