CHAMMI-75: pre-training multi-channel models with heterogeneous microscopy images

📅 2025-12-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Existing cell morphology quantification models are typically trained on a single microscopy modality and generalize poorly across technical platforms (e.g., varying channel counts) and biological contexts. Method: We introduce CHAMMI-75, the first open, multi-channel pretraining dataset comprising 75 heterogeneous biological studies, which enables systematic integration of cross-platform, multi-channel, and multimodal microscopy images. We propose a channel-agnostic pretraining paradigm incorporating heterogeneous image registration, channel-wise normalization, and modality-aware multi-scale augmentation. Contribution/Results: Our approach significantly improves model robustness and generalization across diverse downstream tasks, empirically validating the critical role of modality diversity in cell morphology modeling. This work establishes a novel paradigm and provides essential resources for developing reusable, scalable foundation models in bioimage analysis.
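The summary names channel-wise normalization without giving details. As a minimal sketch of what that step could look like, assuming images are stored as `(channels, height, width)` arrays and that each channel is standardized independently, the hypothetical helper `normalize_channels` below uses NumPy. It is an illustration only, not the authors' pipeline.

```python
import numpy as np


def normalize_channels(image, eps=1e-8):
    """Standardize each channel of a (C, H, W) image independently.

    Hypothetical helper for illustration; the layout and the use of
    mean/std statistics are assumptions, not taken from the paper.
    """
    image = np.asarray(image, dtype=np.float64)
    # Per-channel statistics, kept broadcastable against (C, H, W).
    mean = image.mean(axis=(1, 2), keepdims=True)
    std = image.std(axis=(1, 2), keepdims=True)
    # eps guards against division by zero for constant channels.
    return (image - mean) / (std + eps)


# Synthetic 5-channel image whose channels have very different intensity ranges.
rng = np.random.default_rng(0)
image = rng.random((5, 32, 32)) * np.arange(1, 6).reshape(5, 1, 1)
normalized = normalize_channels(image)

print(normalized.mean(axis=(1, 2)).round(6))
print(normalized.std(axis=(1, 2)).round(6))
```

Because the statistics are computed per channel, the same function applies unchanged to images with any number of channels, which is the property a channel-agnostic pipeline needs.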

📝 Abstract

Quantifying cell morphology using images and machine learning has proven to be a powerful tool to study the response of cells to treatments. However, models used to quantify cellular morphology are typically trained with a single microscopy imaging type. This results in specialized models that cannot be reused across biological studies because the technical specifications do not match (e.g., different number of channels), or because the target experimental conditions are out of distribution. Here, we present CHAMMI-75, an open access dataset of heterogeneous, multi-channel microscopy images from 75 diverse biological studies. We curated this resource from publicly available sources to investigate cellular morphology models that are channel-adaptive and can process any microscopy image type. Our experiments show that training with CHAMMI-75 can improve performance in multi-channel bioimaging tasks primarily because of its high diversity in microscopy modalities. This work paves the way to create the next generation of cellular morphology models for biological studies.
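The abstract describes channel-adaptive models that accept images with differing channel counts but does not specify an architecture. One common way to get that property, sketched here purely as an assumption, is to run a single shared encoder over each channel separately and average the results, so the output size does not depend on the number of channels. The names `embed_channel` and `embed_image` are hypothetical, and the fixed random projection is a toy stand-in for a learned encoder.

```python
import numpy as np


def embed_channel(channel, weights):
    """Embed one (H, W) channel with a shared projection (toy encoder)."""
    return np.tanh(channel.reshape(-1) @ weights)


def embed_image(image, weights):
    """Embed a (C, H, W) image into a fixed-size vector for any C.

    Each channel passes through the same encoder; the per-channel
    embeddings are then mean-pooled across the channel axis.
    """
    per_channel = np.stack([embed_channel(c, weights) for c in image])
    return per_channel.mean(axis=0)


rng = np.random.default_rng(0)
H, W, D = 8, 8, 16
# Random weights stand in for parameters that would normally be learned.
weights = rng.normal(scale=0.1, size=(H * W, D))

three_channel = rng.random((3, H, W))  # e.g. a 3-channel fluorescence image
five_channel = rng.random((5, H, W))   # e.g. a 5-channel image from another study

emb3 = embed_image(three_channel, weights)
emb5 = embed_image(five_channel, weights)
print(emb3.shape, emb5.shape)
```

Mean pooling makes the embedding independent of channel count and channel order. That is convenient for mixing studies, but it discards which stain each channel represents; other designs keep that information, and the abstract does not say which approach the authors take.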

Problem

Research questions and friction points this paper is trying to address.

Develops channel-adaptive models for diverse microscopy image types

Addresses limitations of specialized models in cross-study reusability

Enhances multi-channel bioimaging performance through dataset diversity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Pre-training multi-channel models with heterogeneous microscopy images

Creating channel-adaptive models for any microscopy image type

Using a high-diversity dataset to improve multi-channel bioimaging performance

🔎 Similar Papers

BiomedCLIP: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs