🤖 AI Summary
Problem: AI development in medical imaging is hindered by data silos, fragmented tooling, and challenges in multi-institutional collaboration, resulting in poor reproducibility, limited scalability, and weakened clinical–research integration.
Method: We introduce Kaapana, an open-source platform featuring a novel “algorithm-to-data” distributed architecture that enables privacy-preserving, multi-center federated modeling without moving sensitive DICOM data across institutional boundaries. Built on modular microservices, it integrates Apache Airflow for workflow orchestration, a standardized DICOM protocol stack, a web-based UI, and RESTful APIs—unifying data ingestion, queue management, pipeline execution, and visualization.
Contribution/Results: Kaapana supports use cases ranging from single-site prototyping to nationwide imaging research networks. It reduces technical overhead, improves experimental reproducibility, and eases cross-institutional collaboration between clinicians and data scientists.
📝 Abstract
Developing generalizable AI for medical imaging requires both access to large, multi-center datasets and standardized, reproducible tooling within research environments. However, leveraging real-world imaging data in clinical research environments is still hampered by strict regulatory constraints, fragmented software infrastructure, and the challenges inherent in conducting large-cohort multi-center studies. As a result, projects rely on ad-hoc toolchains that are hard to reproduce, difficult to scale beyond a single institution, and poorly suited to collaboration between clinicians and data scientists. We present Kaapana, a comprehensive open-source platform for medical imaging research designed to bridge this gap. Rather than building single-use, site-specific tooling, Kaapana provides a modular, extensible framework that unifies data ingestion, cohort curation, processing workflows, and result inspection under a common user interface. By bringing the algorithm to the data, it lets institutions keep control over their sensitive data while still participating in distributed experimentation and model development. By integrating flexible workflow orchestration with user-facing applications for researchers, Kaapana reduces technical overhead, improves reproducibility, and enables large-scale, collaborative, multi-center imaging studies. We describe the core concepts of the platform and illustrate how they can support diverse use cases, from local prototyping to nationwide research networks. The open-source codebase is available at https://github.com/kaapana/kaapana
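The "algorithm to the data" idea can be illustrated with a minimal Python sketch. It is not Kaapana's API: the `Site` class, `local_mean`, and `federated_mean` are hypothetical names chosen for illustration, and a weighted mean stands in for real model training. The point it shows is that a coordinator sends a function to each site, the function runs on data that stays local, and only aggregate results travel back.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# An aggregate result: (local statistic, number of local samples).
Update = Tuple[float, int]


@dataclass
class Site:
    """Hypothetical institution; `values` stands in for data that never leaves it."""
    name: str
    values: Sequence[float]

    def run(self, algorithm: Callable[[Sequence[float]], Update]) -> Update:
        # The algorithm is shipped to the site and executed locally.
        # Only its aggregate return value is sent back to the coordinator.
        return algorithm(self.values)


def local_mean(values: Sequence[float]) -> Update:
    """Compute a local statistic plus the sample count needed for weighting."""
    return sum(values) / len(values), len(values)


def federated_mean(sites: List[Site]) -> float:
    """Combine per-site results into a global estimate, weighted by sample count."""
    updates = [site.run(local_mean) for site in sites]
    total = sum(count for _, count in updates)
    return sum(mean * count for mean, count in updates) / total


sites = [Site("site_a", [1.0, 2.0, 3.0]), Site("site_b", [10.0])]
global_mean = federated_mean(sites)
print(global_mean)
```

Weighting by sample count makes the federated estimate equal to the mean over the pooled data, even though the coordinator only ever sees one number and one count per site.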