Binarized Mamba-Transformer for Lightweight Quad Bayer HybridEVS Demosaicing

📅 2025-03-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high computational overhead and large model size of real-time demosaicing of Quad Bayer images from HybridEVS sensors on edge devices, this paper proposes a lightweight binarized Mamba-Transformer hybrid architecture. Its key contributions are: (1) the first binarized Mamba (Bi-Mamba) module, which keeps the Selective Scan operation in full precision while incorporating global visual information to mitigate accuracy degradation; and (2) the first co-binarization of Mamba and Swin Transformer, enabling efficient long-range dependency modeling. Leveraging quantization-aware training and the hybrid architectural design, the model achieves a size under 1 MB, reduces FLOPs by 87%, and accelerates inference by 3.2×, attaining real-time frame rates on mobile platforms and setting a new state of the art in both efficiency and performance.

📝 Abstract
Quad Bayer demosaicing is the central challenge for enabling the widespread application of Hybrid Event-based Vision Sensors (HybridEVS). Although existing learning-based methods that leverage long-range dependency modeling have achieved promising results, their complexity severely limits deployment on mobile devices for real-world applications. To address these limitations, we propose BMTNet, a lightweight Mamba-based binary neural network designed for efficient and high-performing demosaicing of HybridEVS RAW images. First, to effectively capture both global and local dependencies, we introduce a hybrid Binarized Mamba-Transformer architecture that combines the strengths of the Mamba and Swin Transformer architectures. Next, to significantly reduce computational complexity, we propose a binarized Mamba (Bi-Mamba), which binarizes all projections while retaining the core Selective Scan in full precision. Bi-Mamba also incorporates additional global visual information to enhance global context and mitigate precision loss. We conduct quantitative and qualitative experiments to demonstrate the effectiveness of BMTNet in both performance and computational efficiency, providing a lightweight demosaicing solution suited for real-world edge devices. Our code and models are available at https://github.com/Clausy9/BMTNet.
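To make "binarizes all projections" concrete, here is a minimal sketch of a binarized linear projection in plain Python. It is not taken from the BMTNet code: the function names `binarize` and `binary_linear` are hypothetical, and the per-row scale (mean absolute weight) is a common choice in the binary-network literature that is assumed here rather than confirmed by the abstract. The Selective Scan, which the abstract says stays in full precision, is not modeled.

```python
def binarize(weights):
    """Map one row of real-valued weights to signs in {-1, +1} plus a scale.

    Assumption: the scale is the mean absolute value of the row, a common
    choice for binary networks; the paper may use a different scheme.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return signs, scale


def binary_linear(x, weight_rows):
    """Apply a projection whose weights are binarized row by row.

    x           -- input vector (kept in full precision in this sketch)
    weight_rows -- list of real-valued weight rows, one per output unit
    """
    outputs = []
    for row in weight_rows:
        signs, scale = binarize(row)
        # With +/-1 weights the multiply reduces to add/subtract.
        outputs.append(scale * sum(s * xi for s, xi in zip(signs, x)))
    return outputs


print(binary_linear([2.0, 1.0, 3.0], [[0.5, -1.5, 1.0]]))
```

Each weight is stored as a single sign bit plus one shared scale per row, which is where the storage saving of a binarized projection comes from.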
Problem

Research questions and friction points this paper is trying to address.

Lightweight demosaicing for HybridEVS RAW images
Reducing computational complexity for mobile deployment
Enhancing global and local dependency modeling
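For readers unfamiliar with the input format, the sketch below prints one repeating tile of a plain Quad Bayer color filter array, in which each site of a Bayer pattern is expanded to a 2x2 block of same-color pixels. The RGGB ordering and the helper names are assumptions for illustration, and the event pixels that a HybridEVS sensor adds to the array are deliberately left out.

```python
def quad_bayer_color(row, col):
    """Return the color ('R', 'G' or 'B') sampled at pixel (row, col).

    Assumption: RGGB base ordering. Each Bayer site covers a 2x2 block of
    pixels, so the pattern repeats every 4 pixels in each direction.
    """
    bayer = (("R", "G"), ("G", "B"))  # assumed RGGB base pattern
    return bayer[(row // 2) % 2][(col // 2) % 2]


def quad_bayer_tile(height=4, width=4):
    """Build the color layout of a height x width patch as a list of strings."""
    return [
        "".join(quad_bayer_color(r, c) for c in range(width))
        for r in range(height)
    ]


for line in quad_bayer_tile():
    print(line)
# prints:
# RRGG
# RRGG
# GGBB
# GGBB
```

Demosaicing has to recover the two missing color channels at every pixel of such a layout, and same-color samples sit further apart than in a standard Bayer array.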
Innovation

Methods, ideas, or system contributions that make the work stand out.

Binarized Mamba-Transformer for lightweight demosaicing
Hybrid architecture combining Mamba and Swin Transformer
Binarized Mamba reduces computational complexity effectively
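One general reason binarization lowers compute is that a dot product between two vectors with entries in {-1, +1} can be evaluated with an XOR and a bit count instead of multiplications. The sketch below illustrates that identity in plain Python; the helper names are hypothetical, and whether BMTNet's kernels are implemented this way is not stated on this page.

```python
def pack_signs(signs):
    """Pack a list of +1/-1 values into an int; bit i is set when signs[i] == +1."""
    bits = 0
    for i, s in enumerate(signs):
        if s == 1:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n {-1, +1} vectors given as packed bits.

    Matching positions contribute +1 and mismatching positions -1, so the
    result is n - 2 * (number of mismatches).
    """
    mismatches = bin(a_bits ^ b_bits).count("1")
    return n - 2 * mismatches


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
reference = sum(x * y for x, y in zip(a, b))
fast = binary_dot(pack_signs(a), pack_signs(b), len(a))
print(reference, fast)
```

The ordinary multiply-accumulate result and the XOR-based result agree, which can be checked on any pair of sign vectors.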
Shiyang Zhou
Harbin Institute of Technology, Shenzhen
Haijin Zeng
Harvard University
Yunfan Lu
Hong Kong University of Science and Technology, Guangzhou
Tong Shao
Dolby Laboratories, Inc.
Deep Learning, Computer Vision, Video/Image Coding
Ke Tang
Harbin Institute of Technology, Shenzhen
Yongyong Chen
Harbin Institute of Technology, Shenzhen
Jie Liu
Harbin Institute of Technology, Shenzhen
Jingyong Su
Professor, Harbin Institute of Technology at Shenzhen, China
Computer Vision and Multimodal, Data-Centric ML, Medical Image Analysis, Statistics on Manifold