Intra and Inter Parser-Prompted Transformers for Effective Image Restoration

📅 2025-03-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Effectively leveraging vision foundation models (VFMs) for feature parsing remains challenging in image restoration. Method: This paper proposes the Parser-Prompted Transformer (PPTformer), a novel Transformer-based architecture integrating IRNet and PPFGNet. It introduces two complementary attention mechanisms: intra-layer IntraPPA for implicit long-range parsing perception, and inter-layer InterPPA for explicit fusion of multi-level VFM semantic features. Additionally, it designs a parsing-prompted pixel-wise gating feed-forward network (PPFG) for fine-grained feature modulation, and incorporates cross-modal cross-attention and VFM feature transfer. Contribution/Results: PPTformer achieves state-of-the-art performance across four challenging image restoration tasks—deraining, defocus deblurring, desnowing, and low-light enhancement—demonstrating significant improvements in reconstructing degraded images while effectively harnessing VFM-derived semantics.
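The parsing-prompted pixel-wise gating idea described above can be sketched minimally: parser-derived features produce a per-pixel gate that modulates the restoration features before a pointwise projection. This is an illustrative numpy sketch under assumed shapes and weights, not the paper's actual PPFG implementation (which operates inside a Transformer feed-forward block).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def parser_gated_ffn(restoration_feat, parser_feat, w_gate, w_out):
    """Pixel-wise gating modulation (illustrative): parser features yield a
    per-pixel, per-channel gate in (0, 1) that scales the restoration
    features, followed by a pointwise projection. Shapes and weights here
    are hypothetical, not taken from the paper."""
    gate = sigmoid(parser_feat @ w_gate)   # (H*W, C): gate from parser cues
    modulated = restoration_feat * gate    # element-wise (pixel-wise) modulation
    return modulated @ w_out               # pointwise output projection

rng = np.random.default_rng(0)
hw, c = 16, 8                              # a 4x4 feature map with 8 channels
rest = rng.standard_normal((hw, c))        # restoration features
pars = rng.standard_normal((hw, c))        # parser (VFM-derived) features
out = parser_gated_ffn(rest, pars,
                       rng.standard_normal((c, c)),
                       rng.standard_normal((c, c)))
print(out.shape)  # (16, 8)
```

The sigmoid keeps every gate value strictly inside (0, 1), so parser cues can suppress or pass restoration features per pixel without ever flipping their sign.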

📝 Abstract
We propose Intra and Inter Parser-Prompted Transformers (PPTformer) that explore useful features from visual foundation models for image restoration. Specifically, PPTformer contains two parts: an Image Restoration Network (IRNet) for restoring images from degraded observations and a Parser-Prompted Feature Generation Network (PPFGNet) for providing IRNet with reliable parser information to boost restoration. To enhance the integration of the parser within IRNet, we propose Intra Parser-Prompted Attention (IntraPPA) and Inter Parser-Prompted Attention (InterPPA) to implicitly and explicitly learn useful parser features to facilitate restoration. The IntraPPA re-considers cross attention between parser and restoration features, enabling implicit perception of the parser from a long-range and intra-layer perspective. Conversely, the InterPPA initially fuses restoration features with those of the parser, followed by formulating these fused features within an attention mechanism to explicitly perceive parser information. Further, we propose a parser-prompted feed-forward network to guide restoration within pixel-wise gating modulation. Experimental results show that PPTformer achieves state-of-the-art performance on image deraining, defocus deblurring, desnowing, and low-light enhancement.
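The IntraPPA described in the abstract computes cross attention with queries drawn from restoration features and keys/values from parser features, so every restoration token can attend to parser semantics at any spatial location. A minimal numpy sketch of that cross-attention pattern, with assumed single-head shapes and illustrative weights (not the paper's implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_src, kv_src, wq, wk, wv):
    """Cross attention between two feature streams: queries from the
    restoration branch, keys/values from the parser branch, enabling
    long-range, intra-layer perception of parser information."""
    q = query_src @ wq
    k = kv_src @ wk
    v = kv_src @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])   # scaled dot-product scores
    return softmax(scores, axis=-1) @ v       # parser-informed output

rng = np.random.default_rng(1)
n, d = 16, 8                                  # 16 tokens, 8 channels (assumed)
rest = rng.standard_normal((n, d))            # restoration-feature tokens
pars = rng.standard_normal((n, d))            # parser-feature tokens
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
out = cross_attention(rest, pars, wq, wk, wv)
print(out.shape)  # (16, 8)
```

InterPPA, by contrast, would first fuse the two streams and then run attention over the fused features; the sketch above covers only the intra-layer cross-attention direction.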
Problem

Research questions and friction points this paper is trying to address.

Effectively leveraging vision foundation models (VFMs) for feature parsing remains challenging in image restoration.
How to integrate parser information into a restoration network both implicitly (within a layer) and explicitly (across layers).
How to use parser features to guide fine-grained, pixel-wise modulation of restoration features.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Intra and Inter Parser-Prompted Transformers for image restoration
Parser-Prompted Feature Generation Network boosts restoration
IntraPPA and InterPPA enhance parser feature integration
Cong Wang
Shenzhen Campus of Sun Yat-sen University, China; Centre for Advances in Reliability and Safety, Hong Kong; The Hong Kong Polytechnic University, Hong Kong
Jinshan Pan
Nanjing University of Science and Technology
Computer Vision · Image Processing · Computational Photography · Machine Learning
Liyan Wang
PhD candidate at Waseda University
Computational Analogies · Machine Translation · Language Modeling · Natural Language Processing
Wei Wang
Shenzhen Campus of Sun Yat-sen University, China