Panoptic Pairwise Distortion Graph

📅 2026-04-13

🤖 AI Summary
Existing methods for image-pair quality assessment struggle to model the structured relationships among region-level distortions, and so lack fine-grained detail and interpretability. This work introduces a new task, the distortion graph, which extends scene graphs from single images to image pairs: the graph encodes distortion types, severity levels, comparative relations, and quality scores across regions. To support the task, the authors present PandaSet, the first region-level distortion dataset; PandaBench, a dedicated benchmark; and Panda, a specialized neural architecture that integrates graph modeling, regional visual analysis, and multimodal learning. Experiments show that Panda significantly outperforms existing approaches on PandaBench. Mainstream multimodal large language models perform poorly at first, but their region-level distortion understanding improves markedly when they are trained on PandaSet or prompted with distortion graphs.

📝 Abstract
In this work, we introduce a new perspective on comparative image assessment by representing an image pair as a structured composition of its regions. In contrast, existing methods focus on whole-image analysis while implicitly relying on region-level understanding. We extend the intra-image notion of a scene graph to the inter-image setting and propose a novel task, the Distortion Graph (DG). DG treats paired images as a structured topology grounded in regions, and represents dense degradation information such as distortion type, severity, comparison, and quality score in a compact, interpretable graph structure. To realize the task of learning a distortion graph, we contribute (i) a region-level dataset, PandaSet, (ii) a benchmark suite, PandaBench, with varying region-level difficulty, and (iii) an efficient architecture, Panda, to generate distortion graphs. We demonstrate that PandaBench poses a significant challenge for state-of-the-art multimodal large language models (MLLMs), as they fail to understand region-level degradations even when given explicit region cues. We show that training on PandaSet or prompting with DG elicits region-wise distortion understanding, opening a new direction for fine-grained, structured pairwise image assessment.
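
To make the structure described in the abstract concrete, here is a minimal Python sketch of what a distortion graph might look like as a data structure. It is not the paper's schema: the class names, field names, relation labels, and the rule that derives the comparison from the two quality scores are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class RegionNode:
    """One region of one image, with its degradation attributes."""
    region_id: str        # hypothetical label shared by both images, e.g. "sky"
    image: str            # "A" or "B"
    distortion_type: str  # e.g. "blur", "noise"
    severity: str         # e.g. "low", "high"
    quality_score: float  # assumed convention: higher means better quality


@dataclass
class DistortionGraph:
    """Region nodes plus comparison edges between matching regions."""
    nodes: Dict[Tuple[str, str], RegionNode] = field(default_factory=dict)
    edges: List[Tuple[str, str]] = field(default_factory=list)

    def add_region(self, node: RegionNode) -> None:
        # Nodes are keyed by (image, region_id) so both images can share labels.
        self.nodes[(node.image, node.region_id)] = node

    def compare(self, region_id: str) -> str:
        """Record and return the comparison relation for one region pair.

        Assumption: the relation is derived from the quality scores alone.
        """
        a = self.nodes[("A", region_id)]
        b = self.nodes[("B", region_id)]
        if a.quality_score > b.quality_score:
            relation = "A_better"
        elif a.quality_score < b.quality_score:
            relation = "B_better"
        else:
            relation = "similar"
        self.edges.append((region_id, relation))
        return relation


# Usage: the same region is blurred in image A and lightly noisy in image B.
graph = DistortionGraph()
graph.add_region(RegionNode("sky", "A", "blur", "high", 0.3))
graph.add_region(RegionNode("sky", "B", "noise", "low", 0.8))
print(graph.compare("sky"), graph.edges)
```

Keying nodes by image and region keeps the per-region attributes (type, severity, score) separate from the inter-image comparison edges, which mirrors the intra-image versus inter-image distinction the abstract draws.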
Problem

Research questions and friction points this paper is trying to address.

image quality assessment
region-level degradation
distortion understanding
pairwise image comparison
structured representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distortion Graph
region-level assessment
structured image comparison
PandaSet
multimodal large language models