Time-Lapse Video-Based Embryo Grading via Complementary Spatial-Temporal Pattern Mining

📅 2025-06-05
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Current automatic IVF embryo selection methods suffer from two key limitations: (1) reliance on localized morphological features without holistic quality assessment, and (2) dependence on clinical pregnancy outcomes, which are highly confounded by non-embryonic factors. To address this, we introduce the novel task of holistic embryo quality grading directly from full-length time-lapse monitoring (TLM) videos. We curate a large-scale, clinically validated dataset comprising over 2,500 real TLM videos with expert annotations. We propose CoSTeM, the first end-to-end video-level framework for holistic grading, which jointly models static morphology and dynamic developmental trajectories. CoSTeM integrates a mixture-of-experts layer with cross-attention fusion, a temporal selection module, and a time-aware Transformer to enable complementary spatiotemporal representation learning. Evaluated on real-world clinical data, CoSTeM significantly outperforms state-of-the-art methods, delivering an interpretable, deployable AI solution for embryo screening. Both code and dataset will be publicly released.

๐Ÿ“ Abstract
Artificial intelligence has recently shown promise in automated embryo selection for In-Vitro Fertilization (IVF). However, current approaches either address partial embryo evaluation, lacking holistic quality assessment, or target clinical outcomes inevitably confounded by extra-embryonic factors, both of which limit clinical utility. To bridge this gap, we propose a new task, Video-Based Embryo Grading: the first paradigm that directly utilizes full-length time-lapse monitoring (TLM) videos to predict embryologists' overall quality assessments. To support this task, we curate a real-world clinical dataset comprising over 2,500 TLM videos, each annotated with a grading label indicating the overall quality of the embryo. Grounded in clinical decision-making principles, we propose a Complementary Spatial-Temporal Pattern Mining (CoSTeM) framework that conceptually replicates embryologists' evaluation process. CoSTeM comprises two branches: (1) a morphological branch using a Mixture of Cross-Attentive Experts layer and a Temporal Selection Block to select discriminative local structural features, and (2) a morphokinetic branch employing a Temporal Transformer to model global developmental trajectories, synergistically integrating static and dynamic determinants for grading embryos. Extensive experimental results demonstrate the superiority of our design. This work provides a valuable methodological framework for AI-assisted embryo selection. The dataset and source code will be publicly available upon acceptance.
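The two-branch design described in the abstract can be illustrated with a toy sketch. This is NOT the authors' implementation: the expert weight vectors, the top-k frame selection standing in for the Temporal Selection Block, the mean frame-to-frame delta standing in for the Temporal Transformer, and the `alpha` fusion weight are all illustrative assumptions; the real model learns these components end to end on video features.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def morphological_branch(frame_feats, expert_weights, top_k=2):
    # Toy mixture of experts: each "expert" is a weight vector scoring one
    # structural aspect of a frame; a softmax gate mixes expert outputs.
    # Keeping only the top-k frames is a crude stand-in for the paper's
    # Temporal Selection Block (discriminative local structure).
    frame_scores = []
    for f in frame_feats:
        expert_outs = [sum(w * x for w, x in zip(e, f)) for e in expert_weights]
        gate = softmax(expert_outs)
        frame_scores.append(sum(g * o for g, o in zip(gate, expert_outs)))
    top = sorted(range(len(frame_scores)), key=lambda i: -frame_scores[i])[:top_k]
    return sum(frame_scores[i] for i in top) / len(top)

def morphokinetic_branch(frame_feats):
    # Global developmental trajectory summarized as the mean frame-to-frame
    # feature change; a crude proxy for the Temporal Transformer's dynamics.
    deltas = [
        sum(abs(b - a) for a, b in zip(f0, f1))
        for f0, f1 in zip(frame_feats, frame_feats[1:])
    ]
    return sum(deltas) / len(deltas)

def grade(frame_feats, expert_weights, alpha=0.5):
    # Fuse static (morphological) and dynamic (morphokinetic) evidence.
    s = morphological_branch(frame_feats, expert_weights)
    d = morphokinetic_branch(frame_feats)
    return alpha * s + (1 - alpha) * d
```

A video whose frames never change yields zero morphokinetic signal, so the sketch reproduces the abstract's point that static morphology alone misses developmental dynamics.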
Problem

Research questions and friction points this paper is trying to address.

Automated embryo grading using full-length time-lapse videos
Integrating static and dynamic features for holistic assessment
Addressing limitations of partial evaluation and confounding factors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Time-lapse video-based embryo grading
Complementary Spatial-Temporal Pattern Mining
Mixture of Cross-Attentive Experts layer
Yong Sun
The Hong Kong University of Science and Technology (Guangzhou)

Yipeng Wang
College of Computer Science, Beijing University of Technology
Computer Networks, Artificial Intelligence

Junyu Shi
The Hong Kong University of Science and Technology (Guangzhou)

Zhiyuan Zhang
The Hong Kong University of Science and Technology (Guangzhou)

Yanmei Xiao
Center for Reproductive Medicine, Guangdong Second Provincial General Hospital

Lei Zhu
The Hong Kong University of Science and Technology (Guangzhou)

Manxi Jiang
Center for Reproductive Medicine, Guangdong Second Provincial General Hospital

Qiang Nie
Assistant Professor, Hong Kong University of Science and Technology, Guangzhou, China
Robotics, Human-Robot Interaction, Artificial Intelligence, Computer Vision