🤖 AI Summary
In Mamba-based light field super-resolution, multi-directional scanning produces redundant features, and pure state-space models have limited capacity to preserve spatial-angular coupling and disparity information. To address these issues, this paper proposes LFMT, a hybrid Mamba-Transformer framework. Methodologically, LFMT replaces conventional multi-directional scanning with a subspace simple scanning strategy and introduces a two-stage modeling mechanism: shallow layers employ subspace Mamba blocks to extract spatial-angular features, while deep layers combine epipolar-plane Mamba and Transformer blocks in parallel to jointly model non-local epipolar-plane correlations. This design relieves the representational bottleneck of pure state-space models in capturing disparity structure. Experiments demonstrate that LFMT consistently outperforms state-of-the-art methods on both synthetic and real-world light field datasets, achieving significant performance gains at low computational overhead.
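The two-stage organization described above lends itself to a compact structural sketch. The following PyTorch snippet is purely illustrative: the internals of the paper's SA-RSMB, EPMB, and EPTB blocks are not given here, so a generic residual block stands in for each, and the fusion layer and block counts are assumptions, not the authors' design.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Generic residual point-wise block: a stand-in for the paper's
    SA-RSMB / EPMB / EPTB, whose internals are not reproduced here."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(channels, channels), nn.GELU(),
            nn.Linear(channels, channels),
        )

    def forward(self, x):
        return x + self.body(x)


class LFMTSketch(nn.Module):
    """Structural sketch of the two-stage design (assumed composition)."""

    def __init__(self, channels: int, n_shallow: int = 2, n_deep: int = 4):
        super().__init__()
        # Stage I: shallow spatial-angular feature extraction (SA-RSMB role).
        self.stage1 = nn.Sequential(
            *[ResidualBlock(channels) for _ in range(n_shallow)]
        )
        # Stage II: parallel epipolar-plane branches (EPMB and EPTB roles).
        self.mamba_branch = nn.ModuleList(
            ResidualBlock(channels) for _ in range(n_deep)
        )
        self.attn_branch = nn.ModuleList(
            ResidualBlock(channels) for _ in range(n_deep)
        )
        self.fuse = nn.Linear(2 * channels, channels)  # assumed fusion step

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (..., channels) token features extracted from the light field.
        feat = self.stage1(feat)
        for m_blk, t_blk in zip(self.mamba_branch, self.attn_branch):
            a = m_blk(feat)  # state-space branch over epipolar planes
            b = t_blk(feat)  # attention branch over epipolar planes
            feat = self.fuse(torch.cat([a, b], dim=-1))
        return feat
```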
📝 Abstract
Recently, Mamba-based methods, with their advantages in long-range information modeling and linear complexity, have shown great potential for improving both the computational cost and performance of light field image super-resolution (LFSR). However, current multi-directional scanning strategies lead to inefficient and redundant feature extraction when applied to complex LF data. To overcome this challenge, we propose a Subspace Simple Scanning (Sub-SS) strategy, upon which we design the Subspace Simple Mamba Block (SSMB) for more efficient and precise feature extraction. Furthermore, we propose a dual-stage modeling strategy to address the limited capacity of state-space models in preserving spatial-angular and disparity information, thereby enabling a more comprehensive exploration of non-local spatial-angular correlations. Specifically, in stage I we introduce the Spatial-Angular Residual Subspace Mamba Block (SA-RSMB) for shallow spatial-angular feature extraction; in stage II we employ a dual-branch parallel structure combining the Epipolar Plane Mamba Block (EPMB) and the Epipolar Plane Transformer Block (EPTB) for deep epipolar feature refinement. Building upon these modules and strategies, we introduce a hybrid Mamba-Transformer framework, termed LFMT. LFMT integrates the strengths of Mamba and Transformer models for LFSR, enabling comprehensive information exploration across the spatial, angular, and epipolar-plane domains. Experimental results demonstrate that LFMT significantly outperforms current state-of-the-art LFSR methods, achieving substantial performance improvements while maintaining low computational complexity on both real-world and synthetic LF datasets.
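To make the Sub-SS idea concrete, below is a minimal, hypothetical PyTorch sketch of scanning a 4D light field one subspace at a time: each angular view's spatial plane, and each spatial position's angular grid, is traversed with a single forward scan rather than multiple directional scans. An `nn.LSTM` stands in for the Mamba/state-space layer so the snippet runs without extra dependencies; the tensor layout `(B, U, V, H, W, C)` and module names are assumptions, not the authors' code.

```python
import torch
import torch.nn as nn


class SubspaceSimpleScan(nn.Module):
    """Hypothetical sketch of a subspace simple-scan (Sub-SS) style block.

    Instead of scanning the flattened 4D light field in several directions,
    the tensor is factored into spatial and angular subspaces, and each
    subspace is scanned once with a single sequence mixer (a Mamba/SSM
    layer in the paper; an LSTM stands in here for runnability)."""

    def __init__(self, channels: int):
        super().__init__()
        self.spatial_mixer = nn.LSTM(channels, channels, batch_first=True)
        self.angular_mixer = nn.LSTM(channels, channels, batch_first=True)

    def forward(self, lf: torch.Tensor) -> torch.Tensor:
        # lf: (B, U, V, H, W, C) -- angular grid U x V, spatial grid H x W.
        B, U, V, H, W, C = lf.shape

        # Spatial subspace: one forward scan per angular view.
        x = lf.reshape(B * U * V, H * W, C)
        x, _ = self.spatial_mixer(x)
        lf = x.reshape(B, U, V, H, W, C)

        # Angular subspace: one forward scan per spatial position.
        x = lf.permute(0, 3, 4, 1, 2, 5).reshape(B * H * W, U * V, C)
        x, _ = self.angular_mixer(x)
        lf = x.reshape(B, H, W, U, V, C).permute(0, 3, 4, 1, 2, 5)
        return lf.contiguous()


# Usage: a 5x5 angular grid of 32x32 views with 16 feature channels.
feats = torch.randn(1, 5, 5, 32, 32, 16)
out = SubspaceSimpleScan(16)(feats)  # same shape as the input
```

Under this factorization, each subspace is scanned once per Mamba layer, which is the efficiency argument behind Sub-SS: the cost of four or more directional scans over the full 4D volume is replaced by single scans over much shorter per-subspace sequences.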