Quality Assessment and Distortion-aware Saliency Prediction for AI-Generated Omnidirectional Images

📅 2025-06-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Systematic research on quality assessment and distortion-aware saliency prediction for AI-generated omnidirectional images (AIGODIs) remains lacking. Method: We introduce OHF2024—the first multi-dimensional subjective database for AIGODIs—featuring triple-dimensional quality annotations (realism, naturalness, pleasantness) and distortion-sensitive saliency maps. We propose two task-specific models—BLIP2OIQA for omnidirectional image quality assessment and BLIP2OISal for distortion-aware saliency prediction—both built upon a shared BLIP-2 visual encoder to enable joint modeling and end-to-end visual experience optimization. Additionally, we incorporate a gradient-guided local distortion suppression strategy. Contribution/Results: Both models achieve state-of-the-art performance on AIGODI quality assessment and saliency prediction benchmarks. Optimized images exhibit statistically significant improvements in perceptual quality. The OHF2024 database and source code are publicly released.

📝 Abstract
With the rapid advancement of Artificial Intelligence Generated Content (AIGC) techniques, AI-generated images (AIGIs) have attracted widespread attention, among which AI-generated omnidirectional images (AIGODIs) hold significant potential for Virtual Reality (VR) and Augmented Reality (AR) applications. However, AIGODIs exhibit unique quality issues, and research on their quality assessment and optimization is still lacking. To this end, this work first studies the quality assessment and distortion-aware saliency prediction problems for AIGODIs, and then presents a corresponding optimization process. Specifically, we first establish a comprehensive database, termed OHF2024, that reflects human feedback on AI-generated omnidirectional images and includes both subjective quality ratings evaluated from three perspectives and distortion-aware salient regions. Based on the constructed OHF2024 database, we propose two models with shared encoders built on BLIP-2, named BLIP2OIQA and BLIP2OISal, to evaluate the human visual experience and to predict distortion-aware saliency for AI-generated omnidirectional images, respectively. Finally, based on the proposed models, we present an automatic optimization process that uses the predicted visual experience scores and distortion regions to further enhance the visual quality of an AI-generated omnidirectional image. Extensive experiments show that BLIP2OIQA and BLIP2OISal achieve state-of-the-art (SOTA) results on the human visual experience evaluation and distortion-aware saliency prediction tasks for AI-generated omnidirectional images, and can be effectively used in the optimization process. The database and code will be released at https://github.com/IntMeGroup/AIGCOIQA to facilitate future research.
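The abstract describes an optimization loop that combines a predicted visual-experience score with a distortion-aware saliency map to suppress local distortions. The paper's actual components are BLIP-2-based networks and a gradient-guided suppression strategy; as a minimal sketch of the loop's control flow only, the toy stand-ins below (`assess_quality`, `predict_distortion_saliency`, the box-blur "suppression") are all hypothetical placeholders, not the authors' method.

```python
import numpy as np


def assess_quality(image):
    # Hypothetical stand-in for BLIP2OIQA: a scalar visual-experience score
    # in [0, 1] (toy proxy based on global contrast).
    return float(np.clip(image.std() * 4.0, 0.0, 1.0))


def predict_distortion_saliency(image):
    # Hypothetical stand-in for BLIP2OISal: a per-pixel map in [0, 1]
    # marking attention-grabbing distorted regions (toy proxy: local
    # gradient magnitude).
    gy, gx = np.gradient(image)
    mag = np.hypot(gx, gy)
    return mag / (mag.max() + 1e-8)


def optimize_image(image, steps=5, strength=0.3):
    # Sketch of the abstract's idea: repeatedly suppress distortions in the
    # most distortion-salient regions, keeping a change only while the
    # predicted quality score improves.
    best, best_score = image, assess_quality(image)
    for _ in range(steps):
        sal = predict_distortion_saliency(best)
        # 3x3 box blur as a placeholder for local distortion suppression.
        padded = np.pad(best, 1, mode="edge")
        blurred = sum(
            padded[dy:dy + best.shape[0], dx:dx + best.shape[1]]
            for dy in range(3) for dx in range(3)
        ) / 9.0
        # Blend toward the suppressed version only where saliency is high.
        candidate = (1 - strength * sal) * best + strength * sal * blurred
        score = assess_quality(candidate)
        if score <= best_score:
            break
        best, best_score = candidate, score
    return best, best_score
```

The guard on the score mirrors the paper's end-to-end goal: edits are accepted only when the quality predictor agrees the visual experience improved.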
Problem

Research questions and friction points this paper is trying to address.

Assessing quality of AI-generated omnidirectional images for VR/AR
Predicting distortion-aware saliency in AI-generated omnidirectional images
Optimizing visual quality of AI-generated omnidirectional images
Innovation

Methods, ideas, or system contributions that make the work stand out.

Established OHF2024 database for human feedback
Proposed BLIP2OIQA and BLIP2OISal models
Developed automatic optimization for visual quality
Authors

Liu Yang, Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Huiyu Duan, Shanghai Jiao Tong University (Multimedia Signal Processing)
Jiarui Wang, Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Jing Liu, Tianjin University, Tianjin 300072, China
Menghan Hu, East China Normal University (Signal processing, Medical imaging, Hyperspectral imaging, Image processing, Agricultural)
Xiongkuo Min, Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Guangtao Zhai, Professor, IEEE Fellow, Shanghai Jiao Tong University (Multimedia Signal Processing, Visual Quality Assessment, QoE, AI Evaluation, Displays)
Patrick Le Callet, Prof. Universite de Nantes, LS2N, Polytech Nantes, Institut Universitaire de France (IUF) (cognitive computing for MM QoE, human perception and applications in ICT)