🤖 AI Summary
This study addresses the challenge of predicting spatiotemporal patterns of *Sog-D* gene activity in *Drosophila* embryonic development along the anterior–posterior (AP) and dorsal–ventral (DV) axes at future time points. To this end, we propose a novel integrative method combining spatial point process statistics with machine learning, specifically embedding Ripley’s K-function—a measure of spatial clustering—into the XGBoost framework. This integration endows the model with RNA velocity–like capacity for inferring spatial gene expression dynamics from single-molecule-resolution spatial transcriptomic time-series data. Our approach achieves the first accurate, super-resolution, whole-embryo-scale modeling of developmental trajectories. It significantly outperforms existing benchmarks across multiple embryonic stages, yielding substantial improvements in average prediction accuracy. Crucially, it enables robust inference of future spatial gene expression patterns from a single time-point measurement—thereby filling a critical technical gap in high spatiotemporal-resolution predictive modeling of gene expression evolution in developmental biology.
📝 Abstract
In this paper, we introduce an XGBoost-based pipeline that predicts the future distribution of cells expressing the Sog-D gene (active cells) along both the anterior–posterior (AP) and dorsal–ventral (DV) axes of the Drosophila embryo during embryogenesis. The method offers insight into how cells and living organisms control gene expression, using super-resolution, whole-embryo spatial transcriptomics imaging at subcellular, single-molecule resolution. An XGBoost model predicts the active-cell distribution at the next stage from the distribution at the previous stage. To achieve this, we leverage temporally resolved spatial point processes, combining Ripley's K-function with each cell's state at each stage of embryogenesis, and report the average predictive accuracy of the active-cell distribution. The tool is analogous to RNA velocity for spatially resolved developmental biology: from a single data point, we can predict future spatially resolved gene expression using features derived from spatial point processes.
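To make the feature idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes 2-D cell coordinates, a boolean activity flag per cell, and the naive Ripley's K estimator without edge correction, K(r) = A / (n(n-1)) × (number of ordered pairs i ≠ j with distance ≤ r), where A is the area of the observation window. The function names `ripley_k` and `cell_features`, the toy coordinates, and the radii are all hypothetical.

```python
import math


def ripley_k(points, radii, area):
    """Naive Ripley's K estimate (no edge correction) for 2-D points.

    K(r) = area / (n * (n - 1)) * #{ordered pairs i != j with d_ij <= r}
    """
    n = len(points)
    if n < 2:
        return [0.0 for _ in radii]
    counts = [0] * len(radii)
    for i in range(n):
        xi, yi = points[i]
        for j in range(n):
            if i == j:
                continue
            d = math.hypot(xi - points[j][0], yi - points[j][1])
            for k, r in enumerate(radii):
                if d <= r:
                    counts[k] += 1
    scale = area / (n * (n - 1))
    return [scale * c for c in counts]


def cell_features(cells, active, radii):
    """One row per cell: [own state, active neighbours within each radius].

    This per-cell neighbour count is a local analogue of the K-function,
    used here as an assumed way to pair spatial context with cell state.
    """
    rows = []
    for i, (xi, yi) in enumerate(cells):
        row = [int(active[i])]
        for r in radii:
            row.append(sum(
                1
                for j, (xj, yj) in enumerate(cells)
                if j != i and active[j]
                and math.hypot(xi - xj, yi - yj) <= r
            ))
        rows.append(row)
    return rows


# Toy example (hypothetical data): four cells on the corners of a unit square.
cells = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
active = [True, True, False, False]
radii = [1.0, 1.5]

active_points = [c for c, a in zip(cells, active) if a]
k_values = ripley_k(active_points, radii, area=1.0)  # global clustering summary
features = cell_features(cells, active, radii)       # per-cell feature rows
```

Under these assumptions, rows like `features` for one stage, paired with each cell's activity label at the following stage, could be passed to a gradient-boosted classifier such as `xgboost.XGBClassifier` via `fit(X, y)`. The abstract does not specify the actual feature layout or training setup.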