🤖 AI Summary
Real-world image super-resolution (Real-ISR) suffers from complex, heterogeneous degradation patterns; dense models struggle to adaptively model such variations and lack cross-sample knowledge sharing. Method: We propose MoR, a sparse Mixture-of-Experts (MoE) architecture for Real-ISR. Its core innovations are: (i) treating the individual ranks of a LoRA module as independent experts, forming a fine-grained, rank-level expert system; (ii) a CLIP-driven degradation estimation module that aligns text and image semantics to enable degradation-aware routing; and (iii) a degradation-aware load-balancing loss and a zero-expert-slot mechanism that dynamically control the number of activated experts. MoR enables efficient, adaptive restoration in a single inference pass. Results: Extensive experiments show that MoR achieves state-of-the-art performance on multiple Real-ISR benchmarks, improving both reconstruction quality under complex degradations and computational efficiency.
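The rank-level expert idea in (i) can be sketched as follows: a LoRA update of rank n is a sum of n rank-1 terms, so each rank-1 pair (a_r, b_r) can be routed independently like an MoE expert. This is a minimal NumPy sketch under assumed shapes and a simple top-k softmax router; all names (`W`, `A`, `B`, `Wg`, `mor_forward`) and sizes are illustrative, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n_ranks, k = 16, 16, 8, 3   # hypothetical sizes and expert budget

# Frozen base weight plus per-rank LoRA factors: rank r owns one
# down-projection row A[r] and one up-projection column B[:, r],
# and is treated as an independent expert.
W  = rng.normal(size=(d_out, d_in))
A  = rng.normal(size=(n_ranks, d_in))    # down-projections, one per rank-expert
B  = rng.normal(size=(d_out, n_ranks))   # up-projections, one per rank-expert
Wg = rng.normal(size=(n_ranks, d_in))    # router (gating) weights

def mor_forward(x):
    """Route the input to its top-k rank-experts and sum their rank-1 updates."""
    logits = Wg @ x
    top = np.argsort(logits)[-k:]                 # indices of the k selected ranks
    gate = np.exp(logits[top] - logits[top].max())
    gate /= gate.sum()                            # softmax over the selected experts only
    delta = sum(g * (np.outer(B[:, r], A[r]) @ x) for g, r in zip(gate, top))
    return W @ x + delta                          # frozen path + sparse LoRA correction

y = mor_forward(rng.normal(size=d_in))
print(y.shape)  # (16,)
```

In this view, the shared experts mentioned in the abstract would simply be rank indices that bypass the router and are always included in `delta`.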
📝 Abstract
The demonstrated success of sparsely-gated Mixture-of-Experts (MoE) architectures, exemplified by models such as DeepSeek and Grok, has motivated researchers to investigate their adaptation to diverse domains. In real-world image super-resolution (Real-ISR), existing approaches mainly rely on fine-tuning pre-trained diffusion models through Low-Rank Adaptation (LoRA) modules to reconstruct high-resolution (HR) images. However, under equivalent computational budgets, these dense Real-ISR models are limited in their ability to adaptively capture the heterogeneous characteristics of complex real-world degraded samples or to share knowledge across inputs. To address this, we integrate sparse MoE into Real-ISR and propose a Mixture-of-Ranks (MoR) architecture for single-step image super-resolution. We introduce a fine-grained expert partitioning strategy that treats each rank in a LoRA module as an independent expert. This design enables flexible knowledge recombination while isolating fixed-position ranks as shared experts to preserve common-sense features and minimize routing redundancy. Furthermore, we develop a degradation estimation module that leverages CLIP embeddings and predefined positive-negative text pairs to compute relative degradation scores, dynamically guiding expert activation. To accommodate varying sample complexities, we incorporate zero-expert slots and propose a degradation-aware load-balancing loss that dynamically adjusts the number of active experts according to degradation severity, ensuring optimal computational resource allocation. Comprehensive experiments validate our framework's effectiveness and state-of-the-art performance.
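The degradation-scoring and budget-control ideas above can be sketched as: compare a CLIP image embedding against a positive/negative prompt pair (e.g. "noisy, blurry photo" vs. "clean, sharp photo") via cosine similarity, softmax the two similarities into a relative degradation score, and map that score to how many experts to activate, with zero being a valid budget (the zero-expert slot). This NumPy sketch uses random placeholder embeddings instead of a real CLIP model, and the prompt wording, score definition, and the linear score-to-budget mapping are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def degradation_score(img_emb, pos_emb, neg_emb):
    """Relative degradation score in [0, 1]: softmax over cosine similarities of
    the image embedding to the 'degraded' (positive) vs. 'clean' (negative)
    text embedding, returning the mass on the degraded prompt."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    s = np.array([cos(img_emb, pos_emb), cos(img_emb, neg_emb)])
    e = np.exp(s - s.max())
    return (e / e.sum())[0]

def experts_to_activate(score, max_experts=8):
    """Map degradation severity to an expert budget; a score near 0 can yield
    zero active experts (the zero-expert slot), skipping LoRA updates entirely."""
    return int(round(score * max_experts))

# Placeholder 512-d embeddings stand in for CLIP outputs.
rng = np.random.default_rng(0)
img, pos, neg = (rng.normal(size=512) for _ in range(3))
s = degradation_score(img, pos, neg)
n_active = experts_to_activate(s)
print(round(float(s), 3), n_active)
```

A training-time load-balancing loss would then penalize routers whose average expert usage deviates from the budget implied by these per-sample scores.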