🤖 AI Summary
Existing image style transfer methods—particularly those built upon CNN or Transformer backbones—suffer from high computational complexity and slow inference due to their reliance on global receptive field modeling. To address this, we propose SaMam, an efficient state space model (SSM)-based framework tailored for arbitrary style transfer. Our key contributions are threefold: (1) a novel style-aware Mamba encoder-decoder architecture; (2) a local enhancement module coupled with a zigzag spatial scanning strategy to mitigate intrinsic SSM limitations—including pixel forgetting, channel redundancy, and spatial discontinuity; and (3) a style-conditioned state space modeling mechanism. Experiments demonstrate that SaMam achieves superior qualitative and quantitative performance over current state-of-the-art methods, while maintaining O(N) linear time complexity. It simultaneously improves style fidelity, content preservation, and inference speed.
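The claimed O(N) complexity comes from the sequential nature of state space models: the sequence is processed in a single recurrent pass rather than with all-pairs attention. A minimal sketch (a scalar toy SSM, not the authors' code; the parameters A, B, C are illustrative assumptions) of that linear-time scan:

```python
def ssm_scan(x, A=0.9, B=0.5, C=1.0):
    """Toy scalar SSM: h_t = A*h_{t-1} + B*x_t, y_t = C*h_t.

    One pass over the sequence -> O(N) in sequence length,
    in contrast to the O(N^2) pairwise cost of self-attention.
    """
    h, ys = 0.0, []
    for xt in x:       # single recurrent sweep
        h = A * h + B * xt
        ys.append(C * h)
    return ys
```

In a real Mamba block, A, B, and C are learned (and input-dependent), and the recurrence runs per channel over the flattened image tokens, but the linear scan structure is the same.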
📝 Abstract
A global effective receptive field plays a crucial role in image style transfer (ST) for obtaining high-quality stylized results. However, existing ST backbones (e.g., CNNs and Transformers) suffer from huge computational complexity to achieve global receptive fields. Recently, the State Space Model (SSM), especially its improved variant Mamba, has shown great potential for long-range dependency modeling with linear complexity, which offers an approach to resolve the above dilemma. In this paper, we develop a Mamba-based style transfer framework, termed SaMam. Specifically, a Mamba encoder is designed to efficiently extract content and style information. In addition, a style-aware Mamba decoder is developed to flexibly adapt to various styles. Moreover, to address the problems of local pixel forgetting, channel redundancy, and spatial discontinuity in existing SSMs, we introduce both a local enhancement module and a zigzag scan. Qualitative and quantitative results demonstrate that our SaMam outperforms state-of-the-art methods in terms of both accuracy and efficiency.
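The spatial discontinuity mentioned above arises when a 2D feature map is flattened in plain raster order: the last token of one row and the first token of the next are far apart in the image. One common remedy, and a plausible reading of the zigzag scan here (a hypothetical sketch, not the paper's exact implementation), is a boustrophedon ordering that keeps consecutive tokens spatially adjacent:

```python
def zigzag_scan(feat):
    """Flatten a 2D grid (list of rows) in zigzag order:
    even-indexed rows left-to-right, odd-indexed rows right-to-left,
    so each token in the 1D sequence stays a spatial neighbor of the next.
    """
    seq = []
    for i, row in enumerate(feat):
        seq.extend(row if i % 2 == 0 else row[::-1])
    return seq
```

For a 2x2 grid `[[1, 2], [3, 4]]` this yields `[1, 2, 4, 3]`: the jump from token 2 to token 4 crosses only one pixel vertically, whereas raster order would jump from the end of one row back to the start of the next.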