🤖 AI Summary
This work addresses the significant performance degradation of semantic segmentation on low-quality images, which suffer from ambiguous semantic structures and missing high-frequency details. Existing image restoration methods often fail to recover the task-relevant semantic information essential for accurate segmentation. To overcome this limitation, we propose the Restoration Adaptation for Semantic Segmentation (RASS) framework, which leverages segmentation priors to guide the restoration process. RASS integrates a Semantic-Constrained Restoration (SCR) model, cross-attention alignment, and a LoRA-based fusion mechanism to transfer restoration knowledge into the segmentation network. Extensive experiments demonstrate that our approach consistently outperforms state-of-the-art methods on both synthetic and real-world low-quality image datasets in both restoration and segmentation. Additionally, we release a newly curated, annotated real-scene dataset to facilitate future research.
📝 Abstract
In real-world scenarios, the performance of semantic segmentation often deteriorates on low-quality (LQ) images, which may lack clear semantic structures and high-frequency details. Although image restoration offers a promising direction for enhancing degraded visual content, conventional real-world image restoration (Real-IR) models primarily target pixel-level fidelity and often fail to recover task-relevant semantic cues, limiting their effectiveness when applied directly to downstream vision tasks. Conversely, existing segmentation models trained on high-quality data lack robustness under real-world degradations. In this paper, we propose Restoration Adaptation for Semantic Segmentation (RASS), which integrates semantic image restoration into the segmentation process, enabling high-quality semantic segmentation on LQ images directly. Specifically, we first propose a Semantic-Constrained Restoration (SCR) model, which injects segmentation priors into the restoration model by aligning its cross-attention maps with segmentation masks, encouraging semantically faithful image reconstruction. Then, RASS transfers semantic restoration knowledge into segmentation through LoRA-based module merging and task-specific fine-tuning, thereby enhancing the model's robustness to LQ images. To validate the effectiveness of our framework, we construct a real-world LQ image segmentation dataset with high-quality annotations and conduct extensive experiments on both synthetic and real-world LQ benchmarks. The results show that SCR and RASS significantly outperform state-of-the-art methods in segmentation and restoration tasks. Code, models, and datasets will be available at https://github.com/Ka1Guan/RASS.git.
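The abstract does not spell out the LoRA-based module merging step, so the following is only a minimal sketch of the standard LoRA merging operation it presumably builds on: folding a low-rank update back into a frozen base weight, W' = W + (α/r)·BA. The function name `merge_lora` and all shapes here are illustrative assumptions, not the paper's API.

```python
import numpy as np

def merge_lora(W, A, B, alpha=16.0, rank=4):
    """Fold a LoRA update into a frozen base weight.

    Standard LoRA merging (assumed, not taken from the paper):
        W' = W + (alpha / rank) * B @ A
    W: (d_out, d_in) base weight
    A: (rank, d_in) down-projection, B: (d_out, rank) up-projection
    """
    return W + (alpha / rank) * (B @ A)

# Toy example: with B initialized to zero (the usual LoRA init),
# the merged weight equals the base weight exactly.
rng = np.random.default_rng(0)
d_out, d_in, r = 8, 6, 4
W = rng.standard_normal((d_out, d_in))
A = rng.standard_normal((r, d_in))
B = np.zeros((d_out, r))
assert np.allclose(merge_lora(W, A, B, alpha=16.0, rank=r), W)
```

After merging, the adapter matrices can be discarded and the segmentation network runs at its original inference cost, which is the usual motivation for merging rather than keeping adapters attached.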