A Multimodal 3D Foundation Model for Light Sheet Fluorescence Microscopy Enables Few-Shot Segmentation, Classification, and Deblurring

📅 2026-05-25
📈 Citations: 0 (influential: 0)
🤖 AI Summary
This work introduces the first multimodal 3D foundation model tailored to light sheet fluorescence microscopy (LSM). It targets four obstacles in LSM analysis: high dimensionality, large data volume, high annotation cost, and the absence of modality-specific 3D foundation models. Built on a 3D Transformer architecture, the model is pretrained by jointly optimizing masked autoencoding and image-text contrastive learning on a large-scale, unlabeled dataset spanning multiple species, staining protocols, and imaging conditions. The resulting voxel-level representations are transferable: they significantly reduce reliance on labeled data and support segmentation, classification, and deblurring with the same pretrained backbone. Under both standard quantitative metrics and expert qualitative assessment, the model consistently outperforms the methods it is compared against.
📝 Abstract
Light sheet fluorescence microscopy (LSM) enables high-resolution, three-dimensional (3D) imaging of biological specimens, providing rich volumetric data for studying cellular organization, pathology, and vascular networks. However, the size, dimensionality, and annotation burden of LSM data make supervised deep learning approaches costly and difficult to scale. Additionally, despite the abundance of unannotated LSM volumes, foundation models for this modality remain underexplored due to computational challenges and the complexity of volumetric representation learning. In this work, we introduce a 3D foundation model for LSM data, pretrained on a large curated collection of 3D images spanning multiple organisms, stains, and imaging protocols. We learn transferable volumetric representations by jointly optimizing for masked reconstruction and image-text alignment. The pretrained backbone drastically reduces the annotation burden, enabling efficient, few-shot adaptation for varied downstream tasks. We evaluate this approach on downstream segmentation, classification, and deblurring. Our results demonstrate consistent improvements over baselines, (1) when measured using standard evaluation metrics and (2) when rigorously assessed by domain experts. This highlights the potential of foundation model pretraining to reduce annotation requirements while improving performance across diverse LSM analysis tasks. Pretrained model weights and code for pretraining and finetuning are publicly available: https://github.com/AdinaScheinfeld/lsm_fm_public_repo.git.
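As a rough illustration of the joint objective described in the abstract (masked reconstruction plus image-text alignment), the NumPy sketch below combines a masked mean-squared-error term with a symmetric InfoNCE-style contrastive term. It is not taken from the authors' repository. The function names, the `mask_ratio`, `temperature`, and `align_weight` parameters, the array shapes, and the choice of MSE and InfoNCE as the concrete loss forms are all assumptions made for illustration.

```python
import numpy as np


def random_patch_mask(batch, num_patches, mask_ratio, rng):
    """Boolean mask of shape (batch, num_patches); True marks a hidden patch.

    Hypothetical helper: hides the same number of 3D patches in every volume.
    """
    num_masked = int(round(num_patches * mask_ratio))
    noise = rng.random((batch, num_patches))
    # Double argsort turns noise into per-row ranks 0..num_patches-1.
    ranks = noise.argsort(axis=1).argsort(axis=1)
    return ranks < num_masked


def masked_reconstruction_loss(pred, target, mask):
    """MSE over masked patches only (assumed form of the reconstruction term).

    pred, target: (batch, num_patches, voxels_per_patch); mask: (batch, num_patches).
    """
    per_patch = ((pred - target) ** 2).mean(axis=-1)
    return float(per_patch[mask].mean())


def _log_softmax(logits, axis):
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired embeddings (assumed form).

    Row i of image_emb is treated as the match for row i of text_emb.
    """
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = image_emb @ text_emb.T / temperature
    idx = np.arange(logits.shape[0])
    image_to_text = -_log_softmax(logits, axis=1)[idx, idx].mean()
    text_to_image = -_log_softmax(logits, axis=0)[idx, idx].mean()
    return float(0.5 * (image_to_text + text_to_image))


def joint_loss(pred, target, mask, image_emb, text_emb, align_weight=1.0):
    """Weighted sum of both terms; align_weight is a hypothetical hyperparameter."""
    recon = masked_reconstruction_loss(pred, target, mask)
    align = contrastive_loss(image_emb, text_emb)
    return recon + align_weight * align
```

In an actual training loop, `pred` would come from a decoder that sees only the unmasked patches and `image_emb` from the volume encoder; here they are plain arrays so that the loss arithmetic can be inspected on its own.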
Problem

Research questions and friction points this paper is trying to address.

Light sheet fluorescence microscopy
3D foundation model
annotation burden
volumetric representation learning
few-shot learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

3D foundation model
light sheet fluorescence microscopy
few-shot learning
masked reconstruction
image-text alignment
Adina Scheinfeld
Tri-Institutional Program in Computational Biology & Medicine, Weill Cornell Medicine, New York, NY, USA; Department of Radiology, Weill Cornell Medicine, New York, NY, USA; Cornell Tech, New York, NY, USA
Haotan Zhang
Helen and Robert Appel Alzheimer's Disease Research Institute, Feil Family Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY, USA; Graduate Program in Physiology, Biophysics and Systems Biology, Weill Cornell Medicine, New York, NY, USA
Shang Mu
Helen and Robert Appel Alzheimer's Disease Research Institute, Feil Family Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY, USA
Rudolf L. M. van Herten
Department of Radiology, Weill Cornell Medicine, New York, NY, USA; Cornell Tech, New York, NY, USA
Lucas Stoffl
Weill Cornell Medicine, Cornell Tech
Ali Ertürk
Director at Helmholtz Munich; Professor at LMU (Munich University)
AI, deep learning, tissue clearing, neurodegeneration, cancer
Zhuhao Wu
Helen and Robert Appel Alzheimer's Disease Research Institute, Feil Family Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY, USA
Johannes C. Paetzold
Cornell University, Weill Cornell Medicine
Machine Learning, Geometric Deep Learning, Generative Models, Biomedical Image Analysis