PySlyde: A Lightweight, Open-Source Toolkit for Pathology Preprocessing

📅 2025-11-07
🤖 AI Summary
Whole-slide images (WSIs) are difficult to preprocess because of their ultra-high resolution and substantial staining and scanning variability, which has led to fragmented, non-reproducible pipelines for tissue detection, tiling, stain normalization, and annotation parsing. To address this, we propose the first lightweight, open-source, unified WSI preprocessing framework built upon OpenSlide. It natively supports sliding-window loading, automatic tissue region identification, adaptive tiling, standardized stain normalization, and structured annotation parsing, and it directly produces outputs compatible with mainstream pathology foundation models. Our framework substantially lowers the barrier to AI-ready data preparation, makes preprocessing more reproducible and efficient, and accelerates dataset generation by 3–5× in empirical evaluation. The implementation is publicly released to foster community adoption, integration, and extensibility.

📝 Abstract
The integration of artificial intelligence (AI) into pathology is advancing precision medicine by improving diagnosis, treatment planning, and patient outcomes. Digitised whole-slide images (WSIs) capture rich spatial and morphological information vital for understanding disease biology, yet their gigapixel scale and variability pose major challenges for standardisation and analysis. Robust preprocessing, covering tissue detection, tessellation, stain normalisation, and annotation parsing, is critical but often limited by fragmented and inconsistent workflows. We present PySlyde, a lightweight, open-source Python toolkit built on OpenSlide to simplify and standardise WSI preprocessing. PySlyde provides an intuitive API for slide loading, annotation management, tissue detection, tiling, and feature extraction, compatible with modern pathology foundation models. By unifying these processes, it streamlines WSI preprocessing, enhances reproducibility, and accelerates the generation of AI-ready datasets, enabling researchers to focus on model development and downstream analysis.
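
To make the tissue detection and tiling steps named in the abstract concrete, here is a minimal sketch in plain Python. It is not PySlyde's API, which this page does not show. The function names (`saturation`, `tissue_mask`, `tissue_tiles`), the saturation threshold, and the minimum tissue fraction are all assumptions chosen for illustration, and the "slide" is a tiny synthetic thumbnail rather than a real WSI.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def saturation(pixel: Pixel) -> float:
    """HSV-style saturation in [0, 1]; white or grey background scores near 0."""
    hi, lo = max(pixel), min(pixel)
    return 0.0 if hi == 0 else (hi - lo) / hi


def tissue_mask(thumb: List[List[Pixel]], threshold: float = 0.1) -> List[List[bool]]:
    """Mark pixels whose saturation exceeds the (assumed) threshold as tissue."""
    return [[saturation(p) > threshold for p in row] for row in thumb]


def tissue_tiles(mask: List[List[bool]], tile: int,
                 min_fraction: float = 0.5) -> List[Tuple[int, int]]:
    """Return (x, y) top-left corners of non-overlapping tiles, in mask
    coordinates, whose tissue fraction is at least min_fraction."""
    height, width = len(mask), len(mask[0])
    coords = []
    for y in range(0, height - tile + 1, tile):
        for x in range(0, width - tile + 1, tile):
            covered = sum(mask[y + dy][x + dx]
                          for dy in range(tile) for dx in range(tile))
            if covered / (tile * tile) >= min_fraction:
                coords.append((x, y))
    return coords


# Synthetic 4x4 thumbnail: white background with a pink 2x2 patch (top right).
WHITE, PINK = (255, 255, 255), (220, 120, 180)
thumb = [[WHITE] * 4 for _ in range(4)]
for y in (0, 1):
    for x in (2, 3):
        thumb[y][x] = PINK

tiles = tissue_tiles(tissue_mask(thumb), tile=2)
print(tiles)
```

With a real slide, the thumbnail would typically come from OpenSlide's `get_thumbnail`, and the selected coordinates would be scaled by the thumbnail's downsample factor before reading full-resolution tiles with `read_region`.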
Problem

Research questions and friction points this paper is trying to address.

Standardizing gigapixel whole-slide image preprocessing workflows
Addressing tissue detection and stain normalization challenges in pathology
Integrating fragmented annotation and feature extraction processes efficiently
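
As a rough illustration of the stain normalisation problem listed above, the sketch below matches each colour channel's mean and standard deviation to those of a reference tile. This is a deliberately simplified, Reinhard-style idea applied directly in RGB; the original Reinhard method works in a perceptual colour space, and nothing here is taken from PySlyde. The names `channel_stats` and `match_stats` are hypothetical.

```python
from statistics import mean, pstdev


def channel_stats(pixels):
    """Per-channel (mean, population std) for a flat list of RGB pixels."""
    return [(mean(channel), pstdev(channel)) for channel in zip(*pixels)]


def match_stats(source, target_stats):
    """Shift and scale each channel of `source` so that its mean and std
    match `target_stats`, clipping the result to the 0..255 range."""
    source_stats = channel_stats(source)
    normalised = []
    for pixel in source:
        new_pixel = []
        for value, (s_mu, s_sd), (t_mu, t_sd) in zip(pixel, source_stats, target_stats):
            scale = t_sd / s_sd if s_sd > 0 else 1.0  # avoid division by zero
            shifted = (value - s_mu) * scale + t_mu
            new_pixel.append(min(255, max(0, round(shifted))))
        normalised.append(tuple(new_pixel))
    return normalised


# Two toy "tiles" of two pixels each; the reference defines the target look.
source = [(100, 50, 100), (140, 90, 140)]
reference = [(200, 100, 150), (220, 140, 190)]

normalised = match_stats(source, channel_stats(reference))
print(normalised)
```

In practice the reference statistics would be computed once from a chosen template slide and reused for every tile, so that all tiles in a dataset share the same colour distribution.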
Innovation

Methods, ideas, or system contributions that make the work stand out.

Open-source Python toolkit for pathology preprocessing
Intuitive API for slide loading and tissue detection
Unifies workflows to generate AI-ready datasets
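
One step in turning a slide into an AI-ready dataset is attaching annotation labels to tiles. The sketch below assumes annotations arrive as a GeoJSON FeatureCollection of polygons and that each feature carries a `label` property; both the property name and the helper functions are hypothetical and do not describe PySlyde's own annotation format. A tile takes the label of the first polygon containing its centre, tested with standard ray casting.

```python
import json


def load_polygons(geojson_text):
    """Return (label, exterior ring) pairs from a GeoJSON FeatureCollection."""
    polygons = []
    for feature in json.loads(geojson_text)["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "Polygon":
            continue  # this sketch handles simple polygons only
        label = feature.get("properties", {}).get("label", "unlabelled")
        polygons.append((label, geometry["coordinates"][0]))
    return polygons


def point_in_polygon(x, y, ring):
    """Ray casting: count edge crossings of a horizontal ray starting at (x, y)."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):  # edge spans the ray's y coordinate
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def label_tile(x, y, size, polygons, default="background"):
    """Label a square tile by the first polygon that contains its centre."""
    cx, cy = x + size / 2, y + size / 2
    for label, ring in polygons:
        if point_in_polygon(cx, cy, ring):
            return label
    return default


annotation = json.dumps({
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"label": "tumour"},
        "geometry": {"type": "Polygon",
                     "coordinates": [[[0, 0], [512, 0], [512, 512], [0, 512], [0, 0]]]},
    }],
})

polygons = load_polygons(annotation)
inside_label = label_tile(256, 256, 256, polygons)
outside_label = label_tile(512, 512, 256, polygons)
print(inside_label, outside_label)
```

Labelling by tile centre is the simplest rule; an overlap-area rule would be stricter at region boundaries, at the cost of a polygon clipping step.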
Authors
G. Verghese
PharosAI, King’s College London, London, WC2R 2LS, UK.
Anthony Baptista
School of Cancer and Pharmaceutical Sciences, Faculty of Life Sciences and Medicine, King’s College London, London, WC2R 2LS, UK.
Chima I. Eke
School of Cancer and Pharmaceutical Sciences, Faculty of Life Sciences and Medicine, King’s College London, London, WC2R 2LS, UK.
Holly Rafique
School of Cancer and Pharmaceutical Sciences, Faculty of Life Sciences and Medicine, King’s College London, London, WC2R 2LS, UK.
Mengyuan Li
University of Southern California
Hardware Security, Trusted Execution Environment, Cloud computing
F. Mohamed
School of Cancer and Pharmaceutical Sciences, Faculty of Life Sciences and Medicine, King’s College London, London, WC2R 2LS, UK.
Ananya Bhalla
The Francis Crick Institute, London, NW1 1AT, UK.
Lucy Ryan
School of Cancer and Pharmaceutical Sciences, Faculty of Life Sciences and Medicine, King’s College London, London, WC2R 2LS, UK.
Michael Pitcher
PharosAI, King’s College London, London, WC2R 2LS, UK.
Enrico Parisini
Research Associate, Turing Institute
Clinical AI, Transcriptomics, Genomics, Gravity, String Theory
C. Piazzese
Barts Life Sciences, Barts Health NHS Trust, London, E1 1BB, UK.
Liz Ing-Simmons
eResearch, King’s College London, London, WC2R 2LS, UK.
Ananya Grigoriadis
PharosAI, King’s College London, London, WC2R 2LS, UK.