Towards Syn-to-Real IQA: A Novel Perspective on Reshaping Synthetic Data Distributions

📅 2026-01-01

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the limited generalization of blind image quality assessment (BIQA) models trained on synthetic data to real-world images. The authors observe that representations learned from synthetic datasets form discrete clusters (high-quality images group around their reference images, low-quality images group by distortion type), and they trace this pattern to the distribution of the synthetic data rather than to the model architecture. To address it, they propose SynDR-IQA, a distribution-reshaping framework grounded in a theoretical analysis of how sample diversity and redundancy affect generalization error. It combines two strategies: distribution-aware diverse content upsampling, which adds visual diversity while preserving the content distribution, and density-aware redundant cluster downsampling, which thins densely clustered regions. Experiments across three cross-dataset settings (synthetic-to-authentic, synthetic-to-algorithmic, and synthetic-to-synthetic) demonstrate the effectiveness of the approach.

📝 Abstract

Blind Image Quality Assessment (BIQA) has advanced significantly through deep learning, but the scarcity of large-scale labeled datasets remains a challenge. While synthetic data offers a promising solution, models trained on existing synthetic datasets often show limited generalization ability. In this work, we make a key observation that representations learned from synthetic datasets often exhibit a discrete and clustered pattern that hinders regression performance: features of high-quality images cluster around reference images, while those of low-quality images cluster based on distortion types. Our analysis reveals that this issue stems from the distribution of synthetic data rather than model architecture. Consequently, we introduce a novel framework SynDR-IQA, which reshapes synthetic data distribution to enhance BIQA generalization. Based on theoretical derivations of sample diversity and redundancy's impact on generalization error, SynDR-IQA employs two strategies: distribution-aware diverse content upsampling, which enhances visual diversity while preserving content distribution, and density-aware redundant cluster downsampling, which balances samples by reducing the density of densely clustered areas. Extensive experiments across three cross-dataset settings (synthetic-to-authentic, synthetic-to-algorithmic, and synthetic-to-synthetic) demonstrate the effectiveness of our method. The code is available at https://github.com/Li-aobo/SynDR-IQA.
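The downsampling idea in the abstract (thin out densely clustered regions of feature space) can be illustrated with a rough sketch. This is not the authors' implementation, which lives in the linked repository. It assumes that local density is estimated from the mean distance to the k nearest neighbours and that samples are kept with probability proportional to that distance. The function name `density_aware_downsample` and the parameters `keep`, `k`, and `seed` are hypothetical.

```python
import numpy as np


def density_aware_downsample(features, keep, k=5, seed=0):
    """Return indices of `keep` samples, favoring sparse regions of feature space.

    Hypothetical helper for illustration; not taken from the SynDR-IQA code.
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    if not 0 < keep <= n:
        raise ValueError("keep must be in 1..n")
    if not 0 < k < n:
        raise ValueError("k must be in 1..n-1")

    # Pairwise Euclidean distances (O(n^2) memory, fine for a small demo).
    diffs = features[:, None, :] - features[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)

    # Mean distance to the k nearest neighbours; column 0 is the point itself.
    knn_dist = np.sort(dists, axis=1)[:, 1:k + 1].mean(axis=1)

    # A small k-NN distance means high local density, so the distance itself
    # serves as the keep weight: crowded samples are kept less often.
    weights = knn_dist + 1e-12
    probs = weights / weights.sum()

    rng = np.random.default_rng(seed)
    kept = rng.choice(n, size=keep, replace=False, p=probs)
    return np.sort(kept)


# Toy features: one tight cluster of 200 points and one loose group of 20.
demo_rng = np.random.default_rng(0)
dense = demo_rng.normal(0.0, 0.05, size=(200, 2))
sparse = demo_rng.normal(10.0, 1.0, size=(20, 2))
feats = np.vstack([dense, sparse])

kept = density_aware_downsample(feats, keep=40)
print("dense share before:", 200 / 220)
print("dense share after: ", np.mean(kept < 200))
```

In this toy setup the tight cluster stands in for a redundant group of near-identical synthetic samples. A real pipeline would run on learned image features and would need a scalable neighbour search in place of the full distance matrix.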

Problem

Research questions and friction points this paper is trying to address.

Blind Image Quality Assessment

Synthetic Data Distribution

Generalization

Feature Clustering

Data Scarcity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthetic-to-Real IQA

Data Distribution Reshaping

Blind Image Quality Assessment