🤖 AI Summary
This work addresses the challenges posed by the complex morphology and ambiguous boundaries of neurons, which hinder accurate segmentation in electron microscopy data. Existing CNNs lack long-range modeling capabilities, while patch-based Transformers often lose fine voxel-level details. To overcome these limitations, the authors propose NeuroMamba, a framework built upon the Visual Mamba architecture that combines patch-free global state-space modeling with local feature extraction. The method enhances boundary discrimination through channel-wise gating, introduces a resolution-adaptive, spatially continuous scanning mechanism, and employs cross-modulation to fuse multi-view features. Evaluated on four public electron microscopy datasets, NeuroMamba achieves state-of-the-art performance, significantly improving segmentation accuracy for both anisotropic and isotropic volumes.
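The "spatially continuous scanning" idea can be made concrete with a generic serpentine (boustrophedon) ordering: the 3-D volume is flattened into a 1-D token sequence in which every pair of consecutive tokens is spatially adjacent, avoiding the discontinuities of plain raster scanning. This is an illustrative sketch only; NeuroMamba's exact resolution-adaptive ordering is not specified in the summary, and `serpentine_scan` is a hypothetical helper name.

```python
import numpy as np

def serpentine_scan(volume):
    """Flatten a (D, H, W) volume so consecutive tokens are spatially adjacent.

    Within each slice, every other row is reversed (boustrophedon order);
    every other slice's flattened sequence is then reversed as a whole, so
    the scan also crosses slice boundaries without a spatial jump.
    Illustrative only -- not the paper's actual scanning mechanism.
    """
    seq = []
    for d in range(volume.shape[0]):
        sl = volume[d].copy()
        sl[1::2] = sl[1::2, ::-1]   # reverse odd rows within the slice
        flat = sl.reshape(-1)
        if d % 2 == 1:
            flat = flat[::-1]       # reverse odd slices to keep z-continuity
        seq.append(flat)
    return np.concatenate(seq)
```

Because the sequence never jumps between distant voxels, a state-space model that processes it sequentially sees a spatially coherent signal at every step.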
📄 Abstract
Neuron segmentation is the cornerstone of reconstructing comprehensive neuronal connectomes, which is essential for deciphering the functional organization of the brain. The irregular morphology and densely intertwined structures of neurons make this task particularly challenging. Prevailing CNN-based methods often fail to resolve ambiguous boundaries due to the lack of long-range context, whereas Transformer-based methods suffer from boundary imprecision caused by the loss of voxel-level details during patch partitioning. To address these limitations, we propose NeuroMamba, a multi-perspective framework that exploits the linear complexity of Mamba to enable patch-free global modeling and synergizes this with complementary local feature modeling, thereby efficiently capturing long-range dependencies while meticulously preserving fine-grained voxel details. Specifically, we design a channel-gated Boundary Discriminative Feature Extractor (BDFE) to enhance local morphological cues. Complementing this, we introduce the Spatial Continuous Feature Extractor (SCFE), which integrates a resolution-aware scanning mechanism into the Visual Mamba architecture to adaptively model global dependencies across varying data resolutions. Finally, a cross-modulation mechanism synergistically fuses these multi-perspective features. Our method demonstrates state-of-the-art performance across four public electron microscopy (EM) datasets, validating its exceptional adaptability to both anisotropic and isotropic resolutions. The source code will be made publicly available.
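The channel gating and cross-modulation fusion described above can be sketched in a few lines of NumPy. The function names and the exact gating/fusion formulas below are assumptions for illustration (the abstract does not give the BDFE's or the fusion module's equations); the sketch only shows the general pattern of sigmoid-gated channel reweighting and mutual modulation of two feature streams.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_gate(feat, gate_logits):
    """Channel-wise gating: scale each channel by a learned sigmoid gate.

    feat: (C, D, H, W) feature volume; gate_logits: (C,) per-channel logits.
    Illustrative sketch -- the paper's BDFE gating details are not given.
    """
    gates = sigmoid(gate_logits)
    return feat * gates[:, None, None, None]

def cross_modulate(local_feat, global_feat):
    """Cross-modulation fusion: each branch is reweighted by the other's
    gated response, then the two modulated streams are summed.

    Illustrative sketch of mutual modulation, not the paper's exact module.
    """
    return local_feat * sigmoid(global_feat) + global_feat * sigmoid(local_feat)
```

The intuition: the global (state-space) branch tells the local branch where boundaries are plausible at long range, while the local branch sharpens the global branch's voxel-level detail, so the fused features carry both kinds of evidence.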