🤖 AI Summary
Existing no-reference image quality assessment (NR-IQA) methods struggle to model high-order nonlinear interactions between image content and distortions, which limits their consistency with human visual perception. To address this, we propose CoDI-IQA, a hierarchical framework for modeling high-order content-distortion interactions. Its core component, the Progressive Perception Interaction Module (PPIM), characterizes how content and distortions affect quality both independently and jointly by combining internal, coarse, and fine interaction. Multiple PPIMs are stacked to fuse multi-level content and distortion features hierarchically at different granularities, and a tailored training strategy keeps the interaction stable. Extensive experiments show that the method outperforms state-of-the-art approaches in prediction accuracy, data efficiency, and generalization ability.
📝 Abstract
Content and distortion are widely recognized as the two primary factors affecting the visual quality of an image. While existing No-Reference Image Quality Assessment (NR-IQA) methods model these factors, they fail to capture the complex interactions between them, which impairs their ability to predict perceived quality accurately. To address this, we analyze the key properties required for interaction modeling and propose a robust NR-IQA approach termed CoDI-IQA (Content-Distortion high-order Interaction for NR-IQA), which aggregates local distortion features and global content features within a hierarchical interaction framework. Specifically, we propose a Progressive Perception Interaction Module (PPIM) that explicitly models how content and distortions independently and jointly influence image quality. By integrating internal interaction, coarse interaction, and fine interaction, PPIM achieves high-order interaction modeling that lets the model represent the underlying interaction patterns. To ensure sufficient interaction, multiple PPIMs hierarchically fuse multi-level content and distortion features at different granularities. We also tailor a training strategy to CoDI-IQA to maintain interaction stability. Extensive experiments demonstrate that the proposed method notably outperforms state-of-the-art methods in terms of prediction accuracy, data efficiency, and generalization ability.
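The abstract does not specify how PPIM is implemented, so the following is only an illustrative sketch of the general idea: a global content vector and a set of local distortion tokens interacting at three granularities, with one block per feature level. Every function name, the choice of self-gating for the internal step, elementwise gating for the coarse step, dot-product attention for the fine step, and the assumption that all levels share one feature dimension are hypothetical and are not taken from CoDI-IQA.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def internal_interaction(vec):
    # Hypothetical "internal" step: each feature is gated by itself,
    # so a stream interacts only with its own values.
    return [v * sigmoid(v) for v in vec]


def coarse_interaction(content, tokens):
    # Hypothetical "coarse" step: the global content vector gates the
    # mean-pooled distortion descriptor (one product per channel).
    dim = len(content)
    pooled = [sum(t[i] for t in tokens) / len(tokens) for i in range(dim)]
    return [c * p for c, p in zip(content, pooled)]


def fine_interaction(content, tokens):
    # Hypothetical "fine" step: content-conditioned attention over the
    # individual local distortion tokens.
    scale = math.sqrt(len(content))
    scores = [sum(c * x for c, x in zip(content, t)) / scale for t in tokens]
    weights = softmax(scores)
    dim = len(content)
    return [sum(w * t[i] for w, t in zip(weights, tokens)) for i in range(dim)]


def ppim_like_block(content, tokens):
    # Assumed composition: internal first, then coarse and fine terms
    # added to the content term so independent and joint effects coexist.
    content = internal_interaction(content)
    tokens = [internal_interaction(t) for t in tokens]
    coarse = coarse_interaction(content, tokens)
    fine = fine_interaction(content, tokens)
    return [c + a + b for c, a, b in zip(content, coarse, fine)]


def hierarchical_fusion(content_levels, token_levels):
    # One block per feature level; the fused vector of the previous
    # level is added to the next level's content vector.
    fused = None
    for content, tokens in zip(content_levels, token_levels):
        if fused is not None:
            content = [c + f for c, f in zip(content, fused)]
        fused = ppim_like_block(content, tokens)
    return fused


# Toy input: 3 feature levels, 5 local tokens per level, 4 channels.
random.seed(0)
content_levels = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
token_levels = [
    [[random.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
    for _ in range(3)
]
print(hierarchical_fusion(content_levels, token_levels))
```

In this toy version the fused vector would still need a regression head to produce a quality score, and nothing here reflects the training strategy mentioned in the abstract.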