CHAMMI-75: pre-training multi-channel models with heterogeneous microscopy images

📅 2025-12-23
🤖 AI Summary
Problem: Existing models for quantifying cell morphology are typically trained on a single microscopy modality and generalize poorly across technical platforms (e.g., varying channel counts) and biological contexts. Method: We introduce CHAMMI-75, the first open, multi-channel pretraining dataset comprising 75 heterogeneous biological studies, enabling systematic integration of cross-platform, multi-channel, and multimodal microscopy images. We propose a channel-agnostic pretraining paradigm incorporating heterogeneous image registration, channel-wise normalization, and modality-aware multi-scale augmentation. Contribution/Results: Training with CHAMMI-75 improves performance on multi-channel bioimaging tasks, primarily because of the dataset's high diversity in microscopy modalities. This work provides an open resource for developing reusable, scalable foundation models in bioimage analysis.

📝 Abstract
Quantifying cell morphology using images and machine learning has proven to be a powerful tool to study the response of cells to treatments. However, models used to quantify cellular morphology are typically trained with a single microscopy imaging type. This results in specialized models that cannot be reused across biological studies because the technical specifications do not match (e.g., different number of channels), or because the target experimental conditions are out of distribution. Here, we present CHAMMI-75, an open access dataset of heterogeneous, multi-channel microscopy images from 75 diverse biological studies. We curated this resource from publicly available sources to investigate cellular morphology models that are channel-adaptive and can process any microscopy image type. Our experiments show that training with CHAMMI-75 can improve performance in multi-channel bioimaging tasks primarily because of its high diversity in microscopy modalities. This work paves the way to create the next generation of cellular morphology models for biological studies.
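To make the channel-mismatch problem concrete, here is a minimal sketch of one common way to handle images with differing channel counts: normalize each channel independently and treat the image as a bag of single-channel inputs. This is an illustrative assumption, not the method described in the paper. The function names `normalize_channel` and `to_channel_bag` and the toy images are hypothetical.

```python
from statistics import mean, pstdev


def normalize_channel(channel):
    """Z-score one channel, given as a 2D list of pixel intensities."""
    pixels = [p for row in channel for p in row]
    mu = mean(pixels)
    sigma = pstdev(pixels)
    if sigma == 0:
        sigma = 1.0  # constant channel: avoid division by zero
    return [[(p - mu) / sigma for p in row] for row in channel]


def to_channel_bag(image):
    """Split a (C, H, W) nested list into C normalized single-channel images.

    Because every channel is handled independently, the same function
    accepts images with any number of channels.
    """
    return [normalize_channel(ch) for ch in image]


# Toy images with different channel counts (hypothetical data).
three_channel = [
    [[0, 1], [2, 3]],
    [[10, 10], [10, 10]],
    [[5, 7], [9, 11]],
]
five_channel = three_channel + [
    [[1, 2], [3, 4]],
    [[0, 0], [4, 4]],
]

for img in (three_channel, five_channel):
    bag = to_channel_bag(img)
    print(len(bag), "single-channel inputs")
```

A model that consumes such a bag with shared weights per channel would not need a fixed input channel count, which is the property the abstract calls channel-adaptive. How the per-channel features are combined afterwards is left open here.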
Problem

Research questions and friction points this paper is trying to address.

Develops channel-adaptive models for diverse microscopy image types
Addresses limitations of specialized models in cross-study reusability
Enhances multi-channel bioimaging performance through dataset diversity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pre-training multi-channel models with heterogeneous microscopy images
Creating channel-adaptive models for any microscopy image type
Using a high-diversity dataset to improve multi-channel bioimaging performance
Vidit Agrawal
Morgridge Institute for Research, Madison, WI, USA
John Peters
Morgridge Institute for Research, Madison, WI, USA
Tyler N. Thompson
Morgridge Institute for Research, Madison, WI, USA
Mohammad Vali Sanian
Institute for Molecular Medicine Finland (FIMM), Helsinki, Finland
Chau Pham
Boston University
Machine Learning, Computer Vision, Vision-Language, Large Language Models
Nikita Moshkov
Institute of Computational Biology, Helmholtz Munich, Neuherberg, Germany
Arshad Kazi
Morgridge Institute for Research, Madison, WI, USA
Aditya Pillai
Morgridge Institute for Research, Madison, WI, USA
Jack Freeman
Morgridge Institute for Research, Madison, WI, USA
Byunguk Kang
Broad Institute of MIT and Harvard, Cambridge, MA, USA
Samouil L. Farhi
Broad Institute of MIT and Harvard, Cambridge, MA, USA
Ernest Fraenkel
Massachusetts Institute of Technology, Cambridge, MA, USA
Ron Stewart
Morgridge Institute for Research, Madison, WI, USA
Lassi Paavolainen
Institute for Molecular Medicine Finland (FIMM), University of Helsinki
Image-based profiling, Deep learning, Bioimage informatics, High-content analysis
Bryan A. Plummer
Boston University, Boston, MA, USA
Juan C. Caicedo
University of Wisconsin-Madison
Computational Biology, Machine Learning, Computer Vision