🤖 AI Summary
To address the spatial information loss caused by conventional ROI-based dimensionality reduction in whole-brain calcium imaging, this paper proposes the first end-to-end 3D video prediction framework that models neuronal dynamics directly from high-resolution 3D voxel sequences. Methodologically, it integrates large-receptive-field 3D convolutions with a Transformer architecture to explicitly capture long-range spatiotemporal dependencies across brain regions, and incorporates self-supervised pretraining to enhance generalization. Evaluated on ZAPBench, a benchmark for whole-brain activity prediction in zebrafish, the approach outperforms traditional 1D trace-based methods, demonstrating the accuracy gain from preserving native 3D spatial structure. Key contributions include: (i) introducing volumetric video prediction as a paradigm for whole-brain activity modeling, bypassing the trace-compression bottleneck; and (ii) systematically characterizing how spatiotemporal modeling design choices and pretraining strategies affect neural activity prediction performance.
📝 Abstract
Large-scale neuronal activity recordings with fluorescent calcium indicators are increasingly common, yielding high-resolution 2D or 3D videos. Traditional analysis pipelines reduce this data to 1D traces by segmenting regions of interest, leading to inevitable information loss. Inspired by the success of deep learning on minimally processed data in other domains, we investigate the potential of forecasting neuronal activity directly from volumetric videos. To capture long-range dependencies in high-resolution volumetric whole-brain recordings, we design a model with large receptive fields, which allow it to integrate information from distant regions within the brain. We explore the effects of pre-training and perform extensive model selection, analyzing spatio-temporal trade-offs for generating accurate forecasts. Our model outperforms trace-based forecasting approaches on ZAPBench, a recently proposed benchmark on whole-brain activity prediction in zebrafish, demonstrating the advantages of preserving the spatial structure of neuronal activity.
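To make the notion of a "large receptive field" concrete, the following minimal sketch computes how many input voxels along one axis can influence a single output unit of a stack of convolution layers, using the standard receptive-field recurrence. The function name `receptive_field` and all layer configurations are hypothetical illustrations; they are assumptions for this example and are not taken from the paper's architecture.

```python
from typing import Sequence, Tuple

# Each layer is described along one spatial axis as (kernel_size, stride, dilation).
Layer = Tuple[int, int, int]


def receptive_field(layers: Sequence[Layer]) -> int:
    """Receptive field, in input voxels along one axis, of a stack of conv layers."""
    rf = 1    # one output unit initially "sees" a single input voxel
    jump = 1  # spacing, in input voxels, between neighbouring units at this depth
    for kernel, stride, dilation in layers:
        # A kernel spans (kernel - 1) * dilation steps, each `jump` input voxels wide.
        rf += (kernel - 1) * dilation * jump
        # Striding spreads the next layer's units further apart in input space.
        jump *= stride
    return rf


# Hypothetical stacks (assumptions, not the paper's configuration).
plain_stack = [(3, 1, 1)] * 4                       # four 3-wide convolutions
strided_stack = [(3, 2, 1)] * 4                     # same kernels, stride 2 each
dilated_stack = [(3, 1, d) for d in (1, 2, 4, 8)]   # exponentially growing dilation

for name, stack in [
    ("plain", plain_stack),
    ("strided", strided_stack),
    ("dilated", dilated_stack),
]:
    print(f"{name}: {receptive_field(stack)} voxels")
```

With the same number of layers, striding or dilation enlarges the receptive field several-fold over plain convolutions, which is one common way a model can integrate information from distant regions of a volume.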