MambaMIC: An Efficient Baseline for Microscopic Image Classification with State Space Models

📅 2024-09-12
📈 Citations: 1
Influential: 0

🤖 AI Summary
Microscopic image classification (MIC) faces several challenges: a trade-off between global modeling capability and computational efficiency, loss of fine-grained pixel-level information, channel redundancy, and insufficient local perception. To address these, the authors propose MambaMIC, a lightweight and efficient vision backbone. Its core is a Local-Global dual-branch aggregation module that combines local convolutional perception with selective state space modeling (SSM). A Locally Aware Enhanced Filter and a Feature Modulation Interaction Aggregation Module are added to mitigate pixel-level forgetting and channel redundancy. Evaluated on five MIC benchmarks, MambaMIC is reported to achieve state-of-the-art accuracy with fewer parameters, lower FLOPs, and faster inference than the compared methods, which makes it a candidate for resource-constrained biomedical imaging applications.

📝 Abstract
In recent years, CNN-based and Transformer-based methods have made significant progress in Microscopic Image Classification (MIC). However, existing approaches still face a dilemma between global modeling and efficient computation. While the Selective State Space Model (SSM) can model long-range dependencies with linear complexity, it still encounters challenges in MIC, such as local pixel forgetting, channel redundancy, and a lack of local perception. To address these issues, we propose a simple yet efficient vision backbone for MIC tasks, named MambaMIC. Specifically, we introduce a Local-Global dual-branch aggregation module, the MambaMIC Block, designed to effectively capture and fuse local connectivity and global dependencies. In the local branch, we use local convolutions to capture pixel similarity, mitigating local pixel forgetting and enhancing perception. In the global branch, the SSM extracts global dependencies, while a Locally Aware Enhanced Filter reduces channel redundancy and local pixel forgetting. Additionally, we design a Feature Modulation Interaction Aggregation Module for deep feature interaction and key feature re-localization. Extensive benchmarking shows that MambaMIC achieves state-of-the-art performance across five datasets. Code is available at https://zs1314.github.io/MambaMIC
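
To make the "linear complexity" claim concrete, here is a minimal sketch of a selective state space scan over a 1D sequence. It is not the MambaMIC implementation: the scalar hidden state, the function names, and the weights `a`, `b`, `c`, and `w_delta` are all assumptions chosen for illustration. Only the step size depends on the input here; Mamba-style models typically make the input and output projections input-dependent as well.

```python
import math


def softplus(v):
    """Numerically stable softplus, used to keep the step size positive."""
    return max(v, 0.0) + math.log1p(math.exp(-abs(v)))


def selective_scan(xs, a=-1.0, b=1.0, c=1.0, w_delta=1.0):
    """Run a scalar-state SSM over xs in a single pass.

    a (< 0) is a continuous-time decay rate; b and c are input and output
    gains; w_delta is a hypothetical weight that makes the step size depend
    on the current input (the "selective" part of this toy).
    """
    h = 0.0
    ys = []
    for x in xs:
        delta = softplus(w_delta * x)   # input-dependent step size, always > 0
        a_bar = math.exp(delta * a)     # discretized decay, in (0, 1) when a < 0
        b_bar = delta * b               # simplified discretized input gain
        h = a_bar * h + b_bar * x       # constant work per element
        ys.append(c * h)                # readout of the hidden state
    return ys


# An impulse followed by zeros: the response decays step by step.
print(selective_scan([1.0, 0.0, 0.0]))
```

Each element is visited once and the state update is constant work, so the cost grows linearly with sequence length, whereas self-attention compares every pair of positions.
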
Problem

Research questions and friction points this paper is trying to address.

Addresses global modeling vs. efficient computation in MIC
Mitigates local pixel forgetting and channel redundancy
Enhances local perception and global dependency capture
Innovation

Methods, ideas, or system contributions that make the work stand out.

Local-Global dual-branch aggregation module
Locally Aware Enhanced Filter reduces redundancy
Feature Modulation Interaction Aggregation Module
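
The local-global idea in the list above can be illustrated with a toy example on a small grayscale image. This is a hedged sketch, not the paper's MambaMIC Block: the 3x3 mean filter stands in for a learned local convolution, the fixed-decay row-major recurrence stands in for the SSM scan, and the mixing weight `alpha` and all function names are hypothetical.

```python
def local_branch(img):
    """3x3 mean filter with zero padding: each output sees only its neighbours."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        total += img[ii][jj]
            out[i][j] = total / 9.0
    return out


def global_branch(img, decay=0.9):
    """Scan pixels in row-major order with a fixed-decay linear recurrence."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    state = 0.0
    for i in range(h):
        for j in range(w):
            state = decay * state + (1.0 - decay) * img[i][j]
            out[i][j] = state  # carries information from all earlier pixels
    return out


def dual_branch(img, alpha=0.5):
    """Blend the two branches with a fixed weight alpha (an assumption)."""
    loc, glo = local_branch(img), global_branch(img)
    return [[alpha * l + (1.0 - alpha) * g for l, g in zip(lrow, grow)]
            for lrow, grow in zip(loc, glo)]


img = [[1.0, 0.0, 0.0],
       [0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]
fused = dual_branch(img)
```

In this toy, the bright pixel in the top-left corner reaches the bottom-right output only through the global branch, since the 3x3 window there contains no signal. That is the complementary behaviour a dual-branch design aims to combine.
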