🤖 AI Summary
In Mamba-based light field super-resolution, multi-directional scanning produces redundant features, and pure state-space models have limited capacity to preserve spatial-angular coupling and disparity information. To address these issues, this paper proposes LFMT, a hybrid Mamba-Transformer framework. Methodologically, LFMT replaces conventional multi-directional scanning with a subspace simple scanning strategy and introduces a two-stage modeling mechanism: shallow layers employ subspace Mamba blocks to extract spatial-angular features, while deep layers combine epipolar-plane Mamba and Transformer blocks in parallel to jointly model non-local epipolar-plane correlations. This design relieves the representational bottleneck of pure state-space models in capturing disparity structure. Experiments demonstrate that LFMT consistently outperforms state-of-the-art methods on both synthetic and real-world light field datasets, achieving significant performance gains at low computational overhead.
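The two-stage organization described above lends itself to a compact structural sketch. The following PyTorch snippet is purely illustrative: the internals of the paper's SA-RSMB, EPMB, and EPTB blocks are not given here, so a generic residual block stands in for each, and the fusion layer and block counts are assumptions, not the authors' design.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Generic residual point-wise block: a stand-in for the paper's
    SA-RSMB / EPMB / EPTB, whose internals are not reproduced here."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(channels, channels), nn.GELU(),
            nn.Linear(channels, channels),
        )

    def forward(self, x):
        return x + self.body(x)


class LFMTSketch(nn.Module):
    """Structural sketch of the two-stage design (assumed composition)."""

    def __init__(self, channels: int, n_shallow: int = 2, n_deep: int = 4):
        super().__init__()
        # Stage I: shallow spatial-angular feature extraction (SA-RSMB role).
        self.stage1 = nn.Sequential(
            *[ResidualBlock(channels) for _ in range(n_shallow)]
        )
        # Stage II: parallel epipolar-plane branches (EPMB and EPTB roles).
        self.mamba_branch = nn.ModuleList(
            ResidualBlock(channels) for _ in range(n_deep)
        )
        self.attn_branch = nn.ModuleList(
            ResidualBlock(channels) for _ in range(n_deep)
        )
        self.fuse = nn.Linear(2 * channels, channels)  # assumed fusion step

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (..., channels) token features extracted from the light field.
        feat = self.stage1(feat)
        for m_blk, t_blk in zip(self.mamba_branch, self.attn_branch):
            a = m_blk(feat)  # state-space branch over epipolar planes
            b = t_blk(feat)  # attention branch over epipolar planes
            feat = self.fuse(torch.cat([a, b], dim=-1))
        return feat
```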
📝 Abstract
Recently, Mamba-based methods, with their advantages in long-range information modeling and linear complexity, have shown great potential for improving both the computational cost and performance of light field image super-resolution (LFSR). However, current multi-directional scanning strategies lead to inefficient and redundant feature extraction when applied to complex LF data. To overcome this challenge, we propose a Subspace Simple Scanning (Sub-SS) strategy, upon which we design the Subspace Simple Mamba Block (SSMB) for more efficient and precise feature extraction. Furthermore, we propose a dual-stage modeling strategy to address the limited capacity of state-space models in preserving spatial-angular and disparity information, thereby enabling a more comprehensive exploration of non-local spatial-angular correlations. Specifically, in stage I we introduce the Spatial-Angular Residual Subspace Mamba Block (SA-RSMB) for shallow spatial-angular feature extraction; in stage II we employ a dual-branch parallel structure combining the Epipolar Plane Mamba Block (EPMB) and the Epipolar Plane Transformer Block (EPTB) for deep epipolar feature refinement. Building upon these modules and strategies, we introduce a hybrid Mamba-Transformer framework, termed LFMT. LFMT integrates the strengths of Mamba and Transformer models for LFSR, enabling comprehensive information exploration across the spatial, angular, and epipolar-plane domains. Experimental results demonstrate that LFMT significantly outperforms current state-of-the-art LFSR methods, achieving substantial performance improvements while maintaining low computational complexity on both real-world and synthetic LF datasets.
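To make the Sub-SS idea concrete, below is a minimal, hypothetical PyTorch sketch of scanning a 4D light field one subspace at a time: each angular view's spatial plane, and each spatial position's angular grid, is traversed with a single forward scan rather than multiple directional scans. An `nn.LSTM` stands in for the Mamba/state-space layer so the snippet runs without extra dependencies; the tensor layout `(B, U, V, H, W, C)` and module names are assumptions, not the authors' code.

```python
import torch
import torch.nn as nn


class SubspaceSimpleScan(nn.Module):
    """Hypothetical sketch of a subspace simple-scan (Sub-SS) style block.

    Instead of scanning the flattened 4D light field in several directions,
    the tensor is factored into spatial and angular subspaces, and each
    subspace is scanned once with a single sequence mixer (a Mamba/SSM
    layer in the paper; an LSTM stands in here for runnability)."""

    def __init__(self, channels: int):
        super().__init__()
        self.spatial_mixer = nn.LSTM(channels, channels, batch_first=True)
        self.angular_mixer = nn.LSTM(channels, channels, batch_first=True)

    def forward(self, lf: torch.Tensor) -> torch.Tensor:
        # lf: (B, U, V, H, W, C) -- angular grid U x V, spatial grid H x W.
        B, U, V, H, W, C = lf.shape

        # Spatial subspace: one forward scan per angular view.
        x = lf.reshape(B * U * V, H * W, C)
        x, _ = self.spatial_mixer(x)
        lf = x.reshape(B, U, V, H, W, C)

        # Angular subspace: one forward scan per spatial position.
        x = lf.permute(0, 3, 4, 1, 2, 5).reshape(B * H * W, U * V, C)
        x, _ = self.angular_mixer(x)
        lf = x.reshape(B, H, W, U, V, C).permute(0, 3, 4, 1, 2, 5)
        return lf.contiguous()


# Usage: a 5x5 angular grid of 32x32 views with 16 feature channels.
feats = torch.randn(1, 5, 5, 32, 32, 16)
out = SubspaceSimpleScan(16)(feats)  # same shape as the input
```

Under this factorization, each subspace is scanned once per Mamba layer, which is the efficiency argument behind Sub-SS: the cost of four or more directional scans over the full 4D volume is replaced by single scans over much shorter per-subspace sequences.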